Title: GENE CLUSTER INVOLVED IN NOGALAMYCIN BIOSYNTHESIS,
AND ITS USE IN PRODUCTION OF HYBRID ANTIBIOTICS

PRELIMINARY AMENDMENT 

Box Non-Fee Amendment: 

Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Please enter the following amendments to the claims and 
abstract prior to the examination of the application. 

IN THE CLAIMS:

Please amend claims 3, 7, and 12, and add new claims 16-26
as follows. A copy of the marked-up version of amended claims
3, 7, and 12 is attached to this Preliminary Amendment.

3. (Amended) A recombinant DNA, which comprises the DNA
fragment according to claim 1, cloned in a plasmid replicating
in Streptomyces.

7. (Amended) A process for the production of hybrid compounds,
comprising transferring the DNA fragment according to claim 1
into a Streptomyces host, cultivating the recombinant strain
obtained, and isolating the compounds produced.

12. (Amended) A process for the production of hybrid compounds, 
comprising transferring at least one of the genes selected from 
the group consisting of snogJ, snogA, snoaM, snogN, snoaG, snogC,
snogK, snoaL, snoK, snogD, snoW, snogE, snoL, snoO and snoaF into
a Streptomyces host, said genes being derived from the DNA 
fragment of claim 1, cultivating the recombinant strain obtained, 
and isolating the compounds produced. 

16. (New) A recombinant DNA, which comprises the DNA fragment
according to claim 2, cloned in a plasmid replicating in
Streptomyces.

17. (New) The recombinant DNA according to claim 16, which is
the plasmid pSY15c, comprising a 1.4 kb BamHI-SacI fragment from
the plasmid pSY42 and a 1.1 kb MluI-KpnI fragment from the
plasmid pSY43.

18. (New) A process for the production of hybrid compounds, 
comprising transferring the DNA fragment according to claim 2 
into a Streptomyces host, cultivating the recombinant strain 
obtained, and isolating the compounds produced. 



19. (New) The process according to claim 18, wherein the
Streptomyces host is a Streptomyces galilaeus host.



20. (New) The process according to claim 19, wherein the 
Streptomyces galilaeus host is selected from the strains H026, 
H039, H063 and H075, which are mutant strains of S. galilaeus 
ATCC 31615. 

21. (New) The process according to claim 19, wherein an 
anthracycline is produced, which has the following formula I 



22. (New) The process according to claim 19, wherein an 
anthracyclinone is produced, which has the following formula II 





(II)



23. (New) A process for the production of hybrid compounds, 
comprising transferring at least one of the genes selected from 
the group consisting of snogJ, snogA, snoaM, snogN, snoaG, snogC,
snogK, snoaL, snoK, snogD, snoW, snogE, snoL, snoO and snoaF into
a Streptomyces host, said genes being derived from the DNA
fragment of claim 2, cultivating the recombinant strain obtained,
and isolating the compounds produced.

24. (New) The process according to claim 23, wherein the gene
snoaL encoding NAME cyclase is transferred into a Streptomyces
host.

25. (New) The process according to claim 23, wherein at least
one of the genes snogD and snogE encoding glycosyl transferases
is transferred into a Streptomyces host.

26. (New) The process according to claim 23, wherein at least 
one of the genes snogJ, snogN, snogC, snogK and snogA affecting 
the formation of nogalamine and nogalose is transferred into a 
Streptomyces host. 

IN THE ABSTRACT:

Please amend the Application to include the attached
Abstract of the Disclosure on a separate page following the
claims.

REMARKS

Entry of the amendments to the claims and abstract before 
examination of the application is respectfully requested. The 
claims have been amended to remove multiple dependencies. The
abstract submitted herewith is the same as in the original;
however, it has been submitted on a separate sheet.

If there are any questions regarding this Preliminary 
Amendment or this application in general, a telephone call to the 
undersigned would be appreciated since this should expedite the 
prosecution of the application for all concerned. 

It is respectfully requested that, if necessary to effect
a timely response, this paper be considered as a Petition for an
Extension of Time sufficient to effect a timely response, and
that shortages in other fees be charged, or any overpayment in
fees be credited, to the Account of Evenson, McKeown, Edwards &
Lenahan, P.L.L.C., Deposit Account No. 05-1323 (Docket
#1574/49849).



Respectfully submitted, 



April 23, 2001 




Donald D. Evenson 
Registration No. 26,160 



DDE:OAT:vca

EVENSON, McKEOWN, EDWARDS
& LENAHAN, P.L.L.C.
1200 G Street, N.W., Suite 700
Washington, DC 20005
Telephone No.: (202) 628-8800
Facsimile No.: (202) 628-8844
Marked-Up Version of Amendment

In the Claims: 

3. (Amended) A recombinant DNA, which comprises the DNA
fragment according to claim 1 [or 2], cloned in a plasmid
replicating in Streptomyces.

7. (Amended) A process for the production of hybrid compounds,
comprising transferring the DNA fragment according to claim 1 [or
2] into a Streptomyces host, cultivating the recombinant strain
obtained, and isolating the compounds produced.

12. (Amended) A process for the production of hybrid compounds,
comprising transferring at least one of the genes selected from
the group consisting of snogJ, snogA, snoaM, snogN, snoaG, snogC,
snogK, snoaL, snoK, snogD, snoW, snogE, snoL, snoO and snoaF into
a Streptomyces host, said genes being derived from the DNA
fragment of claim 1 [or 2], cultivating the recombinant strain
obtained, and isolating the compounds produced.
Gene cluster involved in nogalamycin biosynthesis, and its use in production of
hybrid antibiotics

Field of the invention

This invention relates to the gene cluster for nogalamycin biosynthesis derived from
Streptomyces nogalater, and the use of the genes therein to obtain novel hybrid
antibiotics for drug screening.

Background of the invention

Anthracyclines are antitumor antibiotics, mainly produced by Streptomyces sp. The
daunomycin family of anthracyclines is commercially the most important, since almost
all of the roughly ten anthracyclines currently in clinical use, or in late clinical
trials as cytotoxic drugs, belong to this family. Despite the long history of
anthracyclines, three decades or so, studies on their biosynthesis are still going on,
and there is continuing interest in obtaining novel molecules for the development of
cancer chemotherapeutics. Genetic engineering is a method currently used for finding
novel molecules for drug screening. Cloning the genes for anthracycline biosynthesis
facilitates the production of hybrid anthracyclines, as well as the use of the genes in
combinatorial biosynthesis to generate novel molecules.

Nogalamycin, which was first described by Bhuyan and Dietz in 1965, is an anthracycline
antibiotic produced by Streptomyces nogalater. It is highly active against tumor cells,
but the toxic properties of the compound have prevented its progress to clinical
trials (Bhuyan and Smith, 1975). However, menogaril (7-O-methylnogarol), a
semisynthetic derivative of nogalamycin, has been studied for its value in the
treatment of cancer (e.g. Yoshida et al., 1996), the interest now being mainly in Japan.
Structurally, nogalamycin (Fig. 1) differs from most other anthracyclines, e.g. from
the daunomycin family, in two noteworthy features: (i) the stereochemistry at position
nine is opposite, and (ii) it has a sugar moiety in which nogalamine is attached at
position 1 by a typical glycosidic bond, but is also attached to carbon 2 by an
extraordinary C-C bond. Structural elucidation of nogalamycin was reported by Wiley
et al. (1977). Furthermore, biosynthetic studies of nogalamycin were published by
Wiley et al. in 1978, giving information on the building blocks: the aglycone moiety is
built from ten acetates; the neutral sugar, nogalose, is derived from glucose; and the
methyl groups of both sugars, nogalamine and nogalose, are transferred from methionine.
The origin of nogalamine was not clearly resolved by Wiley, but most probably
nogalamine is also derived from glucose.

Molecular cloning of biosynthesis genes for anthracyclines has facilitated studies on
molecular genetics, providing tools for rational modification of the structures, and
also for surprising combinations with other antibiotics. Most of the interest has focused
on daunomycin biosynthesis genes, as reported in several publications (Lomovskaya et 
al., 1998; Rajgarhia and Strohl, 1997 and references therein). Some genes for aclacino- 
mycin biosynthesis from S. galilaeus (Fujii and Ebizuka, 1997) and for rhodomycin 
biosynthesis from S. purpurascens (Niemi et al, 1994) have been cloned as well. We 
have cloned the biosynthesis genes for nogalamycin, and successfully used the genes for 
producing hybrid anthracyclines. Most of the genes are involved in polyketide pathway, 
being responsible for the formation of a tricyclic intermediate, and they are reported in 
Ylihonko et al, 1996a and b, and by Torkkell et al, 1997. Despite the advances in 
molecular cloning, the biosynthetic pathway from glucose to sugars found in anthra- 
cyclines is still mainly hypothetical. 

Regarding the genes for deoxyhexose pathway, Madduri et al (1998) have reported that 
a gene derived from avermectin biosynthesis cluster caused the production of hybrid 
anthracyclines altering the sugar moiety when transferred into an S. peucetius mutant. 
The product obtained was epirubicin, a commercially important anthracycline. In this 
case a hydroxy group in the daunosamine moiety was in the opposite stereochemistry 
due to the action of an avermectin biosynthesis gene. S. galilaeus has been used as the 
host to prepare hybrid anthracyclines using the genes derived from rhodomycin pathway 
from S. purpurascens (Niemi et al, 1994), and from the nogalamycin biosynthesis cluster
from S. nogalater (Ylihonko et al, 1996a). The genes for the nogalamycin pathway were
also used to generate hybrid anthracycline production in S. steffisburgensis, which
typically produces steffimycin (Kunnari et al., 1997). Previously, biosynthesis genes for
actinorhodin have been expressed in S. galilaeus, resulting in the formation of
aloesaponarin (Strohl et al., 1991). These hybrid compounds were modified in the
aglycone moiety.

Summary of the invention

The present invention concerns a gene cluster of Streptomyces nogalater, most of the
genes of the cluster being derived from the deoxyhexose pathway for nogalamine and
nogalose. When a DNA fragment of said region is expressed in S. galilaeus, which
produces aclacinomycins, hybrid anthracyclines are obtained, wherein the aglycone
moiety is derived from S. galilaeus, whereas the sugar moiety is characteristic of
neither S. nogalater nor S. galilaeus. Furthermore, when the gene included in said
cluster that encodes a cyclase for nogalamycin is inserted into a suitable plasmid
construction, nogalamycinone, the aglycone of nogalamycin, is obtained. Since the
stereochemistry of nogalamycin differs from that of most other anthracyclines, using
this gene enables the preparation of C-9 stereoisomers of anthracycline molecules.

Detailed description of the invention 

The experimental procedures of the present invention are methods conventional in the
art. The techniques not described in detail here are given in the manuals by Hopwood et
al. "Genetic manipulation of Streptomyces: a laboratory manual", The John Innes
Foundation, Norwich (1985) and by Sambrook et al. (1989) "Molecular cloning: a
laboratory manual". The publications, patents and patent applications cited herein are
given in the reference list in their entirety.

The present invention concerns particularly the gene cluster for nogalamycin
biosynthesis (Sno5-cluster) causing the production of hybrid antibiotics with
modifications in the sugar moiety. The invention concerns specifically the use of the
genes for nogalamine/nogalose biosynthesis to generate hybrid antibiotics modified in
sugar moieties. The invention also concerns the use of a specific cyclase gene included
in the gene cluster of the invention, to generate the C-9 stereoisomers of typical
anthracyclines.

The gene cluster according to the present invention is linked to the earlier reported 
clusters for nogalamycin biosynthesis. The starting point of the present invention was 
the gene cluster for nogalamycin chromophore (International Patent Application WO
96/10581). Subsequently, we have found some genes for the deoxyhexose pathway of 
nogalamycin biosynthesis (Torkkell et al, 1997), and a part of the fragment comprising 
said genes was used to clone the genes for this invention. 

The biosynthesis genes for nogalamycin can be isolated from Streptomyces sp.,
particularly from S. nogalater, which produces nogalamycin. Species which produce
nogalamycin-like anthracyclines can also be used, e.g. S. violaceochromogenes producing
arugomycin (Kawai et al, 1987), or S. avidinii producing avidinorubicin (Aoki et al,
1991).

Genomic DNA of a Streptomyces strain carrying the genes for nogalamycin biosynthesis
is used in preparing a genomic library. Suitable gene fragments for cloning may be
obtained with any frequently cutting restriction enzyme; typically Sau3AI is used. The
isolated fragments can be inserted by ligation into any Escherichia coli vector such as
a plasmid, a phagemid, a phage, or a cosmid. A cosmid vector is preferred since it
enables the cloning of large DNA fragments. A cosmid vector such as pFD666 (ATCC
No. 77286) is suitable for this purpose, as it enables cloning of fragments of about
40 kb. The BamHI site of pFD666, which gives sticky ends compatible with the Sau3AI
fragments, may be used for cloning. Commercially available kits may be used to package
the DNA in phage particles. Various E. coli strains can be infected with the packaged
DNA. An appropriate E. coli strain is, e.g., XL1 Blue MRF', which is deficient in
several restriction systems.
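
The compatibility of BamHI and Sau3AI ends described above can be illustrated with a minimal sketch. It assumes the standard recognition sites (BamHI: G^GATCC; Sau3AI: ^GATC). The function names and the short test sequence are hypothetical and serve only as an illustration; they are not part of the procedure of the invention.

```python
# Recognition site and cut offset on the top strand (standard values).
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "Sau3AI": ("GATC", 0),    # ^GATC
}


def overhang(enzyme: str) -> str:
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    # For a palindromic site the bottom strand is cut at the mirrored
    # position, so the overhang is the stretch between the two cuts.
    return site[cut:len(site) - cut]


def cut_positions(seq: str, enzyme: str) -> list[int]:
    """Return 0-based top-strand cut positions of the enzyme in seq."""
    site, cut = ENZYMES[enzyme]
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + cut)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    example = "AAGGATCCTTGATCAA"  # hypothetical sequence
    for name in ENZYMES:
        print(name, overhang(name), cut_positions(example, name))
```

Both enzymes leave the same GATC overhang, which is why Sau3AI partial-digest fragments can be ligated into a BamHI-cut vector. Every BamHI site also contains a Sau3AI site, but not the reverse.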

Using E. coli as a host strain for the genomic library, hybridization is an advantageous
screening strategy. The probe for hybridization may be any known fragment derived
from the nogalamycin gene cluster, but a short fragment of about 1 kb derived from one
end of the biosynthetic region previously cloned is preferred. Colonies of the genomic
library are transferred for filter hybridization to membranes, preferably to nylon
membranes. Since the average size of a genomic DNA fragment is 40 kb, 2300
colonies give a 99.99% probability of finding the expanded region for nogalamycin
biosynthesis. Any method for hybridization may be used but, in particular, the DIG
System (Boehringer Mannheim GmbH, Germany) is useful. Since the probe is
homologous to the hybridized DNA, it is preferable to carry out the stringent washes of
hybridization at 70°C in a low salt concentration according to Boehringer Mannheim's
manual "DIG System User's Guide for Filter Hybridization". It is suggested that at
least 80% homology is needed for a DNA fragment to bind a probe under the conditions
used for the washes.
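
The library size quoted above can be checked with the standard library-size relation N = ln(1 - P) / ln(1 - f), where P is the desired probability and f is the insert size divided by the genome size. The sketch below is illustrative only. The genome size of 10 000 kb is an assumption that is not stated in the text; it was chosen because it reproduces the quoted figure of about 2300 colonies. The function names are hypothetical.

```python
import math


def clones_needed(probability: float, insert_kb: float, genome_kb: float) -> int:
    """Number of independent clones needed to contain a given sequence
    with the requested probability: N = ln(1 - P) / ln(1 - f)."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


def coverage_probability(n_clones: int, insert_kb: float, genome_kb: float) -> float:
    """Probability that at least one of n_clones contains a given sequence."""
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    ASSUMED_GENOME_KB = 10_000  # assumption, not given in the text
    print(clones_needed(0.9999, 40, ASSUMED_GENOME_KB))
    print(coverage_probability(2300, 40, ASSUMED_GENOME_KB))
```

With a smaller assumed genome, fewer colonies would be needed for the same probability, so the result depends directly on that assumption.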

Using this protocol, seven clones out of about 5000 gave positive signals, and were
picked for DNA isolation. Restriction mapping is an appropriate technique for
characterizing the clones. The positive clones may be digested with convenient
restriction enzymes to establish the physical linkage map of the DNA fragments. The
cosmid used for cloning was a shuttle cosmid replicating in both E. coli and
Streptomyces sp. However, the transfer of the recombinant cosmids into S. lividans TK24,
which is a laboratory strain typically used in Streptomyces cloning, resulted in
deletions, and was abandoned. Instead, we used the plasmid pIJ486, a high copy number
Streptomyces plasmid, in the expression studies. However, any plasmid able to replicate
stably in Streptomyces may be used for this purpose.

Two BglII fragments of one of the clones were separately inserted into pIJ486 vectors,
and the two plasmids obtained were transferred into a primary host, S. lividans TK24.
The recombinant plasmids obtained (pSY42 and pSY43), containing a 10 kb and a 7 kb
fragment from S. nogalater genomic DNA, respectively, were isolated from the primary
host and further introduced into other Streptomyces strains by protoplast transformation.
The recombinant plasmid containing the 10 kb fragment caused the production of hybrid
anthracyclines in the S. galilaeus mutant strain H039, which endogenously produces
aklavinone-rhodinose-rhodinose-rhodinose. A few other S. galilaeus strains (H075,
H026, H063) mutated in the deoxyhexose pathway for the sugars of aclacinomycin were used
in transformation, and new hybrid compounds were obtained. Since the structure of
nogalamycin is almost unique among anthracyclines, the plasmids could be transferred
to other anthracycline-producing strains, such as S. peucetius, which produces
daunomycin, and S. purpurascens, which produces rhodomycins, to modify the structures
of the characteristic antibiotics.

As the cloned cluster was linked to the nogalamycin biosynthesis region already known,
its ability to generate the modification in the sugar moiety suggested the presence of
genes for the deoxyhexose pathway. However, sequencing is necessary to deduce the
function of the genes in the cloned cluster. The DNA fragments of 10 kb and 7 kb were
further inserted into the plasmid pSL1190 for subcloning. Sequencing strategies such as
a deletion set of the DNA fragments, shotgun cloning or primer walking could be used,
but we prefer to use restriction fragments for subcloning. Using the ABI PRISM system
(Perkin-Elmer) for sequencing it is possible to get 500 to 700 bases per reaction,
which means that fragments of about 1 kb sharing overlapping bases are needed for
sequencing. For this purpose, 27 subclones were constructed.

Sequencing of the adjacent BglII fragments, consisting of about 16000 bp, revealed 15
complete ORFs. The sequence analysis can be made with any computer-based program,
such as the GCG package (Madison, Wisconsin, USA). According to the present invention,
the putative gene functions, as deduced from sequence homology to those available
in the libraries, are





    aminotransferase (snogI), not completed
 1. dTDP-glucose synthase (snogJ)
 2. aminomethyl transferase (snogA)
 3. polyketide cyclase (snoaM)
 4. a gene of deoxyhexose pathway, unknown (snogN)
 5. hydroxylase (snoaG)
 6. dTDP-4-dehydrorhamnose reductase (snogC)
 7. dTDP-glucose 4,6-dehydratase (snogK)
 8. NAME cyclase (snoaL)
 9. unknown (snoK)
10. glycosyl transferase, GTF (snogD)
11. unknown (snoW)
12. glycosyl transferase, GTF (snogE)
13. unknown (snoL)
14. unknown (snoO)
15. C-7 ketoreductase (snoaF)
    unknown (snoN), not completed



Gene designations: g indicates that the gene is involved in biosynthesis of the glycosidic
portion, including glycosyl transferases, whereas a indicates that the gene is needed
for the formation of the aglycone moiety.
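
The naming convention above can be expressed as a small lookup. The sketch below is illustrative only; the function name and the returned labels are hypothetical. It assumes that a gene name consists of the prefix "sno", an optional pathway letter (g or a) and a final identifying letter.

```python
def classify_gene(name: str) -> str:
    """Classify a sno gene by the designation letter that follows 'sno'."""
    if not name.startswith("sno") or len(name) < 4:
        raise ValueError(f"not a sno gene name: {name!r}")
    suffix = name[3:]
    # A pathway letter is only present when another letter follows it,
    # e.g. 'snogJ' (g + J) as opposed to 'snoK' (no pathway letter).
    if len(suffix) >= 2 and suffix[0] == "g":
        return "glycosidic portion"
    if len(suffix) >= 2 and suffix[0] == "a":
        return "aglycone moiety"
    return "unassigned"


if __name__ == "__main__":
    for gene in ("snogJ", "snogD", "snoaL", "snoaF", "snoK", "snoW"):
        print(gene, "->", classify_gene(gene))
```

Genes such as snoK or snoW carry no pathway letter and are therefore reported as unassigned, in line with their "unknown" function in the list above.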

Considering the proposed biosynthetic pathway for nogalamycin shown in Fig. 3, we are
able to cause several changes in the structures of antibiotics with the genes identified,
including snoaL, responsible for the cyclization of the fourth ring of the aglycone
moiety while determining the stereochemistry of the anthracyclinone, the genes
affecting the formation of nogalamine and nogalose (snogJ, snogK, snogN, snogC,
snogA), and, in addition, the genes responsible for joining the sugar residues to the
aglycone moiety (snogD and snogE).

These genes could be separately inserted in a vector using suitable restriction sites, or
by amplifying the genes by PCR. The fragments may contain an intrinsic promoter, or a
promoter may be separately cloned. It is advantageous to use a vector carrying a
promoter to allow expression of the genes in a Streptomyces strain. The plasmid
pIJE486 contains the ermE promoter of the erythromycin resistance gene, allowing
constitutive expression of genes inserted in the correct orientation. Special attention
is drawn to the gene encoding a cyclase for the aliphatic ring, but any gene of said
cluster may be expressed in Streptomyces hosts. The said cyclase converts the
stereochemistry at C-9 of auramycinone in TK24, if inserted into the plasmid possessing
the other genes for auramycinone biosynthesis, except the cyclase responsible for the
typical stereochemistry of anthracyclines.
Streptomyces strains, in particular S. galilaeus, carrying the recombinant plasmids are
cultivated in media wherein antibiotics are produced. The hybrid compounds are 
extracted with organic solvents from the culture broth, and the compounds are separated 
and purified using chromatographic techniques. 

According to this invention, S. galilaeus H039 carrying the plasmid pSY42, designated
H039/pSY42, produces aklavinone-4'-epi-2-deoxyfucose in E1 medium
supplemented with thiostrepton to give selection pressure for the plasmid-containing
strains.



S. lividans TK24 carrying the plasmid pSY15c, which contains the genes for the
nogalamycin chromophore and the genes for a cyclase (snoaL) and a ketoreductase (snoaF),
was cultivated in E1 medium supplemented with thiostrepton. The compound
9-epi-auramycinone was produced, and this structure is now called nogalamycinone. Any
DNA fragment of the invention subcloned from the 17 kb nogalamycin biosynthesis
region can be inserted in a vector replicating in Streptomyces, and the products may be
produced by fermentation of the plasmid-containing strains.

Brief description of the drawings 



Fig. 1  shows the structures of nogalamycin, daunomycin and aclacinomycin.

Fig. 2  is a diagram of the gene cluster (Sno5) of the invention for nogalamycin
        biosynthesis.

Fig. 3  describes the proposed biosynthesis pathway for nogalamycin.

Fig. 4  shows a diagram of the plasmid pSY15c. The genes snoaL (aL) and snoaF
        (aF) shown black are inserted in the plasmid pSY15 to give pSY15c. aL
        represents a cyclase snoaL and aF is for C-7 ketoreductase snoaF. pSY15
        (WO 96/10581) generates the production of a tricyclic intermediate for
        nogalamycin biosynthesis in S. lividans. The abbreviations a1, a2 and a3 refer
        to the genes snoa1, snoa2 and snoa3, respectively, for minimal PKS. rA is the
        snorA gene for an activator, aB is the snoaB gene for oxygenase, aC is the
        snoaC gene for methylase, aD is the snoaD gene for polyketide ketoreductase
        and aE is the snoaE gene for aromatase. gF (the snogF gene) and gG (the snogG
        gene) involved in the deoxyhexose pathway are not functional in the construct.
        aph is an aminoglycoside phosphotransferase gene, and tsr is a thiostrepton
        resistance gene.

Examples to further illustrate the invention are given hereafter. 

EXPERIMENTAL 

Materials used 

Restriction enzymes used were purchased from Promega (Madison, Wisconsin, USA) or 
Boehringer Mannheim (Germany), and alkaline phosphatase from Boehringer Mannheim,
and used according to the manufacturers' instructions. Proteinase K was purchased
from Promega and lysozyme from Sigma (St. Louis, USA). Hybond™-N nylon 
membranes used in hybridization were purchased from Amersham (Buckinghamshire, 
England), DIG DNA Labelling Kit and DIG Luminescent Detection Kit from 
Boehringer Mannheim. Qiaquick Gel Extraction Kit from Qiagen (Hilden, Germany) 
was used for isolating DNA from agarose. 

Bacterial strains and their use 

- Escherichia coli XL1 Blue MRF' (Stratagene, La Jolla, CA) was used for cloning.

- Streptomyces nogalater ATCC 27451; the gene cluster of nogalamycin biosynthesis 
was cloned from this strain. 

The host strains to express the genes cloned were: 

- Streptomyces lividans TK24, also used as a primary host to clone DNA propagated in 
E. coli. The strain was provided by prof. Sir David Hopwood, John Innes Centre, UK. 

- Streptomyces galilaeus H039, produces aklavinone-rhodinose-rhodinose-rhodinose 

- Streptomyces galilaeus H026, produces aclacinomycin N, ACMN, (aklavinone- 
rhodosamine-2-deoxyfucose-rhodinose) 
- Streptomyces galilaeus H063, produces aklavinone

- Streptomyces galilaeus H075, produces aklavinone-rhodosamine-2-deoxyfucose-2- 
deoxyfucose 

The detailed description of the mutants H039 and H026 is given in Ylihonko et al.
(1994) and of H075 in the FI patent application No. 981062 (Ylihonko et al., 1998).
H063 has not been described in the literature, but it was obtained by NTG mutagenesis
of S. galilaeus, and selected to be used as the host strain in the hybrid compound
production, as it accumulates aklavinone without any sugar residues.

Plasmids 

E. coli - Streptomyces shuttle cosmid pFD666 (ATCC 77286) was used for cloning the
chromosomal DNA. E. coli cloning vectors pSL1190 (Pharmacia) and pUC19 were used
for preparing the subclones.

pIJ486 is a high copy plasmid vector provided by prof. Sir David Hopwood, John Innes
Centre, UK (Ward et al., 1986).

pIJE486 is a vector containing ermE gene in the polylinker of pIJ486 (Bibb et al.,
1985).

pSY15 is a pIJ486 based plasmid construct, wherein the genes of polyketide pathway
for nogalamycin biosynthesis were cloned (Ylihonko et al., 1996a).

Nutrient media and solutions 

For cultivation of S. nogalater for total DNA isolation, TSB medium was used.
Lysozyme solution (0.3 M sucrose, 25 mM Tris, pH 8 and 25 mM EDTA, pH 8) was
used in isolation of total DNA. TE buffer (10 mM Tris, pH 8.0 and 1 mM EDTA) was
used to dissolve the DNA.

TRYPTONE-SOYA BROTH (TSB) 

Per litre: Oxoid Tryptone Soya Broth powder 30 g.

ISP4

Bacto ISP-medium 4, Difco; 37 g/l.

E1 Per litre in tap water:

glucose              20 g
soluble starch       20 g
Farmamedia            5 g
yeast extract         2.5 g
K2HPO4·3H2O           1.3 g
MgSO4·7H2O            1 g
NaCl                  3 g
CaCO3                 3 g

pH adjusted to 7.4 before autoclaving



General methods 

NMR data were collected with a JEOL JNM-GX 400 spectrometer at ambient
temperature. ¹H and ¹³C NMR samples were internally referenced to TMS.

The anthracycline metabolites were detected by HPLC (LaChrom, Merck Hitachi, pump
L-7100, detector L-7400 and integrator D-7500) using a LiChroCART RP-18 column
(4.6 x 250 mm). Acetonitrile:potassium hydrogen phosphate buffer (60 mM, pH 3.0
adjusted with citric acid) was used as the mobile phase. A gradient from
65% to 30% of potassium dihydrogen phosphate buffer was used to separate the
compounds. The flow rate was 1 ml/min and detection was effected at 430 nm.

ISP4 plates supplemented with thiostrepton (50 µg/ml) were used to maintain the
plasmid-carrying cultures.



Example 1. Cloning the gene cluster for nogalamycin biosynthesis 
1.1 Cosmid library

For the isolation of total DNA, Streptomyces nogalater (ATCC 27451) was grown for
three days in 50 ml of TSB medium supplemented with 0.5% of glycine. The cells
were harvested by centrifuging for 15 min at 3900 x g in 12 ml Falcon tubes, and the



WO 00/24775 



PCT/FI99/00870 




cells were stored at -20°C. Cells from a 12 ml sample of the culture were used to
isolate the DNA. 5 ml of lysozyme solution containing 5 mg of lysozyme/ml was added
onto the cells, and incubated for 20 min at 37°C. 500 µl of 10% SDS containing 0.7 mg of
proteinase K was added onto the cells and incubated for 80 min at 62°C, another 500 µl
of 10% SDS containing 0.7 mg of proteinase K was added, and incubation was
continued for 60 min. The sample was chilled on ice and 600 µl of 3 M NaAc, pH 5.8
were added, and the mixture was extracted with equilibrated phenol (Sigma). The
phases were separated by centrifuging at 1400 x g for 10 min. The DNA was
precipitated from the water phase with an equal volume of isopropanol, spooled with a
glass rod, washed by dipping in 70% ethanol, air dried and dissolved in 500 µl of
TE buffer.

The chromosomal DNA was partially digested with Sau3AI. The DNA fragments were
separated by agarose gel electrophoresis, and the fragments of 30 to 50 kb were cut
from the 0.3% low gelling temperature SeaPlaque® agarose. The DNA bands were
isolated from the gel by heating to 65°C, extracting with an equal volume of equilibrated
phenol, and the phases were separated by centrifuging for 15 min at 2500 x g. The
phenol phase was extracted with TE buffer, centrifuged and the water phases were
pooled. The DNA was precipitated by adding 0.1 volumes of NaAc, pH 5.8 and 2
volumes of ethanol at -20°C for 30 min, and centrifuged for 30 min at 15 000 rpm in a
Sorvall RC5C centrifuge using an SS-34 rotor with adapters for 10 ml tubes. The pellet
was air dried and dissolved in 20 µl of TE buffer. The isolated fragments were ligated
to pFD666 cosmid vector digested with BamHI and dephosphorylated. The DNA was
packaged into phage particles and used to infect E. coli using the Gigapack® III XL
Packaging Extract kit according to the manufacturer's instructions.

1.2 Identification of the clones by hybridization 

The infected cells were grown on LB plates containing 50 µg/ml kanamycin and
transferred to Hybond™-N nylon membranes (Amersham). The membranes were
handled according to the protocol described in Boehringer Mannheim's manual "The
DIG System User's Guide for Filter Hybridization". The probe used to screen the
colonies for an expanded nogalamycin gene cluster was a 1.07 kb SacI fragment from
the cluster described earlier (Torkkell et al., 1997). The plasmid carrying the probe was
digested with SacI, and the fragment was separated from the vector by agarose gel
electrophoresis and isolated from the gel using Qiaquick Gel Extraction Kit (Qiagen).
The probe was labelled with digoxigenin using a random primed labelling system according
to Boehringer Mannheim's manual "The DIG System User's Guide for Filter
Hybridization". 5000 colonies were screened by hybridization at 70°C using the probe
described. Positive colonies were detected using DIG Luminescent Detection Kit
(Boehringer Mannheim). Seven colonies gave a positive signal. Cosmids from the
positive clones were isolated from a 5 ml culture by the alkaline lysis method.
Restriction analysis showed that the cloned fragments overlapped each other, representing
at least 60 kb of continuous DNA. The positive clones obtained were designated
pFDSno1 to pFDSno7.

1.3. Subcloning the fragments for sequencing 

Clone No. 5, designated as pFDSno5, was digested with BglII, and for subcloning two
fragments of about 10 kb and 7 kb were isolated and ligated to pSL1190 digested with
BglII and dephosphorylated. The plasmids obtained were named pSn42 and pSn43,
respectively. These two fragments cover the DNA region adjacent to the previously
characterized area of the nogalamycin biosynthesis cluster. To determine the nucleotide
sequence of the whole 17 kb region cloned in pSn42 and pSn43, convenient
restriction sites were used to subclone the fragments into the vector pUC19 or pSL1190,
giving 16 subclones from the insert of pSn42 and 11 subclones from that of pSn43.

E. coli XL1 Blue MRF' cells were cultivated overnight at 37 °C in 5 ml of LB medium
supplemented with 50 µg/ml of ampicillin. To isolate plasmids for sequencing
reactions, the Wizard Plus Minipreps DNA Purification System kit of Promega or the
Biometra silica spin plasmid miniprep kit of Biomedizinische Analytik GmbH was used
according to the manufacturers' instructions.

DNA sequencing was performed using the automatic ABI DNA sequencer (Perkin-Elmer)
according to the manufacturer's instructions.
1.4 Sequence analysis and the deduced functions of the genes

Sequence analyses were carried out using the GCG sequence analysis software package
(Version 8; Genetics Computer Group, Madison, Wisconsin, USA). The translation table
was modified to accept GTG as a start codon as well. Codon usage was analysed using
published data (Wright and Bibb 1992).
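The effect of accepting GTG as a start codon can be illustrated with a minimal sketch. This is not the GCG CODONPREFERENCE program, which also weighs codon usage; it is a plain open-reading-frame scan written for illustration, and the function names and the `min_codons` threshold are assumptions that do not come from the application.

```python
from typing import List, Tuple

# Start codons accepted by the scan; GTG is included alongside ATG,
# mirroring the modified translation table described in the text.
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, for scanning the complementary strand."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[int, int, int]]:
    """Scan the three forward frames for ORFs.

    Returns (start, end, frame) tuples with 1-based inclusive positions;
    `end` includes the stop codon. `min_codons` (hypothetical threshold)
    counts codons from the start codon up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # first start codon after the previous stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame))
                start = None
    return orfs
```

Genes marked "compl" would be found by running `find_orfs(reverse_complement(seq))` and mapping the coordinates back to the forward strand.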

According to the CODONPREFERENCE program, the sequenced DNA fragment (SEQ
ID NO:1) contained 15 complete open reading frames (ORFs), and the 5' ends of two
further ORFs, one at each end of the fragment according to the invention. The functions of
the genes were deduced by comparing the amino acid sequences translated from their
base sequences with the known protein sequences in the data banks. The results are shown
in Table 1. The positions given refer to the appended sequence listing. The amino acid
sequences of the peptides are given in SEQ ID NO:2 to SEQ ID NO:18.
Table 1

Gene     Position            Amino acids     Deduced function                              Remarks
                             (SEQ ID NO)

snogI    -1027 compl         >342 (2)        aminotransferase                              5' end
snogJ    1192-2073           293 (3)         dTDP-glucose synthase
snogA    2106-2822 compl     238 (4)         aminomethyl transferase
snoaM    2826-3800 compl     324 (5)         a polyketide cyclase
snogN    3799-5025           408 (6)         dnrQ homology (Otten et al., 1995), unknown
snoaG    5088-6356           422 (7)         hydroxylase
snogC    6334-7209 compl     291 (8)         dTDP-4-dehydrorhamnose reductase
snogK    7245-8297 compl     350 (9)         dTDP-glucose-4,6-dehydratase
snoaL    8537-8941           134 (10)        NAME cyclase (nogalonic acid methyl ester)
snoK     8992-9699           235 (11)        unknown
snogD    9745-10917 compl    390 (12)        glycosyl transferase
snoW     11057-11884         275 (13)        unknown
snogE    11928-*             >424 (14)       glycosyl transferase
snoL     13335-13754 compl   139 (15)        unknown
snoO     13974-14441         155 (16)        homologous to mtmX of mithramycin cluster
snoaF    14532-15377         281 (17)        C-7 ketoreductase analogous to aklaviketone ketoreductase
snoN     15450-              >190 (18)       unknown                                       5' end

*: nucleotide sequence of about 100 bp, not known
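The positions and amino acid counts in Table 1 can be cross-checked arithmetically: for a complete ORF whose position range includes the stop codon, the number of amino acids equals (end - start + 1) / 3 - 1. The sketch below applies this to the rows that list both coordinates; the gene names for the 5088-6356, 8537-8941 and 11057-11884 rows are taken from the gene order in claim 12 and the sequence listing features, and the helper names are illustrative assumptions.

```python
from typing import Optional, Tuple

# (gene, start, end, amino acids) for the ORFs in Table 1 that list both
# coordinates. snogI, snogE and snoN are omitted because one end is open.
COMPLETE_ORFS = [
    ("snogJ", 1192, 2073, 293),
    ("snogA", 2106, 2822, 238),
    ("snoaM", 2826, 3800, 324),
    ("snogN", 3799, 5025, 408),
    ("snoaG", 5088, 6356, 422),
    ("snogC", 6334, 7209, 291),
    ("snogK", 7245, 8297, 350),
    ("snoaL", 8537, 8941, 134),
    ("snoK", 8992, 9699, 235),
    ("snogD", 9745, 10917, 390),
    ("snoW", 11057, 11884, 275),
    ("snoL", 13335, 13754, 139),
    ("snoO", 13974, 14441, 155),
    ("snoaF", 14532, 15377, 281),
]


def predicted_length(start: int, end: int) -> int:
    """Amino acids encoded by an ORF whose range includes the stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a multiple of three")
    return span // 3 - 1


def overlap(a: Tuple[int, int], b: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Return the shared positions of two inclusive ranges, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


mismatches = [
    gene for gene, start, end, aa in COMPLETE_ORFS
    if predicted_length(start, end) != aa
]
```

With the values above, `mismatches` is empty, and the two overlaps computed from the table agree with the `misc_feature` entries of the sequence listing (3799..3800 for snoaM/snogN and 6334..6356 for snoaG/snogC).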
1.5 Expression cloning

The 10 kb BglII fragment from pFDSno5 was cloned into the plasmid pIJ486, and the
plasmid obtained was named pSY42. Correspondingly, the 7 kb BglII fragment from
pFDSno5 was cloned into the plasmid pIJE486, and the plasmid pSY43 was obtained.
Plasmid pSY42 was introduced into S. lividans strain TK24 by protoplast transformation,
isolated from it and further introduced into S. galilaeus mutant H039, and after
propagation in H039, transferred to other S. galilaeus mutants blocked in the deoxyhexose
pathway for characteristic sugars of aclacinomycins (H075, H026, and H063).
E1 medium was used for anthracycline production, and the products were extracted
from the culture with toluene:methanol (1:1) at pH 7. Anthracycline metabolites were
analyzed by HPLC. The products of the mutants H039, H026, H063 and H075 carrying
pSY42 differed from those obtained from the mutants without the plasmid.

According to the sequence analysis, pSY42 contained a cyclase gene designated NAMEC
(nogalonic acid methyl ester cyclase), and in pSY43 a ketoreductase gene was identified.
Expression constructs were prepared which contained all the genes needed for
the formation of the nogalamycin aglycone. A 1.4 kb BamHI-SacI fragment from pSY42
(containing NAMEC) and a 1.1 kb MluI-KpnI fragment from pSY43 carrying the gene
for a ketoreductase of the C-7 keto group were ligated to pSY15 linearized with SacI, to
form the plasmid pSY15c (Fig. 4). Plasmid pSY15c was introduced into S. lividans
TK24, and the strain TK24/pSY15c was cultivated in E1 medium supplemented with
thiostrepton. An aglycone compound was produced, and this structure is now called
nogalamycinone.

Example 2. Compounds generated by the snoS-cluster 

2.1 Production and purification of the products derived from H039/pSY42 and 
TK24/pSY15c 

The seed culture, 180 ml of E1 culture of the plasmid-containing strain, H039/pSY42 or
TK24/pSY15c, was obtained by cultivating the strain in three 250 ml Erlenmeyer flasks
containing 60 ml of E1 medium supplemented with thiostrepton (5 µg/ml) for four days
at 30°C, 330 rpm. The combined culture broths (180 ml) were used to inoculate 13 l of
E1 medium in a fermentor (Biostat E). Fermentation was carried out for seven days at
28°C (330 rpm, aeration: 450 l/min).
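The seed culture arithmetic above can be written out as a short sketch: three flasks of 60 ml give the 180 ml inoculum, which corresponds to roughly 1.4 % of the 13 l of fermentor medium. The function names are illustrative assumptions; the volumes are those stated in the text.

```python
FLASKS = 3                 # 250 ml Erlenmeyer flasks used for the seed culture
VOLUME_PER_FLASK_ML = 60   # medium per flask, in ml
FERMENTOR_MEDIUM_L = 13    # medium in the fermentor, in litres


def seed_volume_ml(flasks: int, per_flask_ml: float) -> float:
    """Total seed culture volume obtained by pooling the flasks."""
    return flasks * per_flask_ml


def inoculum_percent(seed_ml: float, medium_l: float) -> float:
    """Seed volume as a percentage of the fermentor medium volume."""
    return 100.0 * seed_ml / (medium_l * 1000.0)


total_seed = seed_volume_ml(FLASKS, VOLUME_PER_FLASK_ML)
ratio = inoculum_percent(total_seed, FERMENTOR_MEDIUM_L)
```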

The cells were harvested by centrifugation. 2.6 l of methanol was used to break the
bacterial cells and to extract the accumulated anthracycline metabolites. The anthracycline
metabolites were then extracted using 2 l of dichloromethane at pH 6. The organic layer was
evaporated to dryness. The viscous residue was flashed through a polyamide (11)
column using water:methanol from 1:9 to 0:10 as the eluent. Pooled fractions containing
the compounds were further purified on a Merck-Hitachi HPLC using a preparative
reversed phase column (LichroCART RP-18, 5 µm) with the mobile phase acetonitrile:1 %
AcOH in water (1:1). Evaporation of the acetonitrile gave pure products as yellow powders,
which were dried under vacuum.

2.2 Structural elucidation of the compounds derived from H039/pSY42 and
from TK24/pSY15c

NMR analysis included NON, BMC, NOE, DEPT and HMBC techniques. Protons were
assigned using NOESY and 2D pTOCSY techniques, and carbons using DEPT and
HMBC techniques.

As deduced from the data given in Tables 2 and 3, the structures revealed were
aklavinone-4'-epi-2-deoxyfucose from the culture of H039/pSY42, and 9-epi-
auramycinone (=nogalamycinone) from the culture of TK24/pSY15c. The chemical
structures of the compounds are shown below in Formula I and Formula II, respectively.



[Formula I and Formula II: structural drawings]



Deposited microorganisms 

The following microorganisms were deposited according to the Budapest Treaty at 
Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ), 
Mascheroder Weg 1b, D-38124 Braunschweig, Germany.

Microorganism                   Accession number    Date of deposit

S. lividans TK24/pSY42
carrying the plasmid pSY42      DSM 12451           14 October 1998

S. lividans TK24/pSY43
carrying the plasmid pSY43      DSM 12452           14 October 1998
Table 2. 1H and 13C assignments of the compound aklavinone-4'-epi-2-
deoxyfucose (Formula I).

Site     1H                                 13C

1        7.74, 1H, dd, 7.5, 1.3             120.1
2        7.68, 1H, dd, 8.4, 7.5             137.3
3        7.27, 1H, dd, 8.3, 1.3             124.6
4        -                                  161.9
4-OH     11.70, 1H, s                       -
4a       -                                  115.4
5        -                                  192.3
5a       -                                  114.4
6        -                                  162.4
6-OH     12.46, 1H, s                       -
6a       -                                  130.9
7        5.18, 1H, dd, 4.3, 3.1             71.3
8A       2.51, 1H, dd, 15.0, 4.3            33.9
8B       2.32, 1H, dd, 15.0, 3.1            -
9        -                                  72.1
9-OH     4.58, 1H, s                        -
10       4.02, 1H, s                        56.9
10a      -                                  142.4
11       7.40, 1H, s                        120.8
11a      -                                  133.1
12       -                                  180.7
12a      -                                  132.6
13A      1.73, 1H, dq, 14.2, 7.4            32.0
13B      1.51, 1H, dq, 14.2, 7.4            -
14       1.10, 3H, t, 7.4                   6.7
15       -                                  171.1
16       3.69, 3H, s                        52.5
1'       5.41, 1H, d, 3.5                   101.7
2'a      1.75, 1H, ddd, 12.8, 11.2, 3.4     37.7
2'e      2.19, 1H, dd, 12.8, 5.3            -
3'       3.71, 1H, ddd, 12.0, 9.0, 5.3      69.0
4'       3.14, 1H, dd, 9.1, 9.0             78.1
5'       3.88, 1H, dq, 9.1, 6.2             68.8
6'       1.36, 3H, d, 6.2                   17.6
Table 3. 1H and 13C assignments of 9-epi-auramycinone (Formula II).

Site     1H                                 13C

1        7.76, 1H, dd, 7.5, 1.2             119.8
2        7.67, 1H, dd, 8.3, 7.5             137.4
3        7.28, 1H, dd, 8.3, 1.2             124.8
4        -                                  162.5
4-OH     11.86, 1H, s                       -
4a       -                                  115.6
5        -                                  192.7
5a       -                                  114.6
6        -                                  160.9
6-OH     12.76, 1H, s                       -
6a       -                                  134.1
7        5.40, 1H, t, 7.0                   64.0
8A       2.66, 1H, dd, 13.9, 7.0            40.9
8B       1.89, 1H, dd, 13.9, 7.1            -
9        -                                  70.5
9-OH     3.49, 1H, brs                      -
10       3.93, 1H, d, 0.8                   56.0
10a      -                                  142.1
11       7.51, 1H, d, 0.8                   120.1
11a      -                                  133.3
12       -                                  180.9
12a      -                                  132.1
13       1.44, 3H, s                        28.7
14       -                                  173.0
15       3.90, 3H, s                        52.6
Claims

1. Isolated and purified DNA fragment, which is the gene cluster for the anthracycline
biosynthetic pathway of the bacterium Streptomyces nogalater, being included in a 10 kb
and a 7 kb flanked BglII fragments of S. nogalater genome.

2. The DNA fragment according to claim 1, comprising the nucleotide sequence given
in SEQ ID NO:1, or a sequence showing at least 80% homology to said sequence.

3. A recombinant DNA, which comprises the DNA fragment according to claim 1 or 2,
cloned in a plasmid replicating in Streptomyces.

4. The recombinant DNA according to claim 3, which is the plasmid pSY15c, comprising
a 1.4 kb BamHI-SacI fragment from the plasmid pSY42 and a 1.1 kb MluI-KpnI
fragment from the plasmid pSY43.

5. Plasmid pSY42, deposited in S. lividans strain TK24/pSY42 with the deposition
number DSM 12451.

6. Plasmid pSY43, deposited in S. lividans strain TK24/pSY43 with the deposition
number DSM 12452.

7. A process for the production of hybrid compounds, comprising transferring the DNA 
fragment according to claim 1 or 2 into a Streptomyces host, cultivating the recombinant 
strain obtained, and isolating the compounds produced. 

8. The process according to claim 7, wherein the Streptomyces host is a Streptomyces 
galilaeus host. 

9. The process according to claim 8, wherein the Streptomyces galilaeus host is selected 
from the strains H026, H039, H063 and H075, which are mutant strains of S. galilaeus 
ATCC 31615. 
10. The process according to claim 8, wherein an anthracycline is produced, which has
the following formula I

[Formula I: structural drawing]

11. The process according to claim 8, wherein an anthracyclinone is produced, which
has the following formula II

[Formula II: structural drawing]


12. A process for the production of hybrid compounds, comprising transferring at least
one of the genes selected from the group consisting of snogJ, snogA, snoaM, snogN,
snoaG, snogC, snogK, snoaL, snoK, snogD, snoW, snogE, snoL, snoO and snoaF into a
Streptomyces host, said genes being derived from the DNA fragment of claim 1 or 2,
cultivating the recombinant strain obtained, and isolating the compounds produced.

13. The process according to claim 12, wherein the gene snoaL encoding NAME
cyclase is transferred into a Streptomyces host.

14. The process according to claim 12, wherein at least one of the genes snogD and
snogE encoding glycosyl transferases is transferred into a Streptomyces host.

15. The process according to claim 12, wherein at least one of the genes snogJ, snogN,
snogC, snogK and snogA affecting the formation of nogalamine and nogalose is
transferred into a Streptomyces host.



ABSTRACT OF THE DISCLOSURE 
The present invention relates to the gene cluster for 
nogalamycin biosynthesis derived from Streptomyces nogalater, and 
the use of the genes therein to obtain novel hybrid antibiotics 
for drug screening. 
SEQUENCE LISTING

<110> Galilaeus Oy 

<120> Gene cluster involved in nogalamycin biosynthesis, and its use in 
production of antibiotics 

<160> 18 

<210> 1 

<211> 16020 

<212> DNA 

<213> Streptomyces nogalater ATCC 27451 
<220> 

<221> misc_feature

<222> 3799..3800

<223> "overlapping sequence in the genes snoaM and snogN"

<221> misc_feature

<222> 6334..6356

<223> "overlapping sequence in the genes snoaG and snogC"

<221> misc_feature

<222> 13201..13300

<223> "unknown region"

<400> 1 



agatctcgtc 


cgccagtgcc 


tcggtgaccg 


gcaacgagcc 


cttggcgtag 


ccgagatggg 
agaaaccggt


catggtgtgc 


acgggccagg 


gataactgat 


gttgagggcg 


atgtcgtagg 
ggcctccagc
gtcgcggatg
tacacgtagt
gcagcagcag


ccccgtgtcc 
gcgtgccacc
cctcgatgta
cgggacagct
tgcgccgcag
tgtacttcgt


ccagccggct 


gttgtgcccg 


ggggtttcga 


360 


cgacgtagta 


gcggctctcc 
agcgcagccg


ccgcagccgg 


tccgccaccc 


420 


gctcgtcgtc 


ggtgagcacc 


gcgccgccgt 
gcccagcacc


ttggtcgggt 
cagcagacac


cgggtgcgtg 
gcgtgaggac


ggcctccacc 


tgggacgtgt 


ccatcaggta 


gtcctcctcg 


cgcacgtcca 


720 


cgaagacggg 


cgtggcaccg 


gccgagtcga 


tcgcgacgac 


cgtgggcgcg 


gcggtgttgg 


780 


acacggtgac 
gcccctccgg
agacggcccg
aggtgcggtc
atcattctcg 
cagcttctcc 
gccggcgtca 
ctgttcggcg 
aggggtatcg 
ctggcgctgg 
gccgaggaat 
ggagtcggcg 
ccccgctcca 
gcccggcggc 
tacatggaac 
accggcacac 
cagggcatca 
caggcctgct 
gccatcgcgg 
cggccttacc 
aggaactcgg 
agggtgagca 
cgcacctcca 
cggccctcac 
caccagggct 
gccgcccgca 
gcgtcgaaac 
ggcagccgcc 
gcgaacagcc 
gaacgcgccc 
cagctctttc 
acgcgtcagt 
accfctgaccg 
gccagccgct 
accgatcggt 
ctgaaggcat 
ccgtcgagac 



gttatcatcg actccgtctt 
ccgggggtac ggggagcagg 
ccgtcgggga caagccgatg 
cggacatcct catcatcagc 
acggcgcaca gctcggactc 
ccgaggcgtt cctgatcggt 
gcgacaacat attccacggg 
tggacgggtg tgtcctgttc 
aggcgaacgc gtccgggcgg 
accgggccat caccggactc 
tgcgcccctc cgcccgcggc 
gaggccgggc ccggctcgtg 
ccgagtcact cctgcaggcc 
ggatcgcctg catcgaggag 
acgaactggg cgcgcgcctg 
aggagtgcac ggggcgggtg 
cggccccgcg caccccgacg 
ccgggcagcc cgcgtcctcg 
ggtcgatctc cgtgaactcg 
tgcgggtcct gcggccctgc 
cgcgtgccag gtccccggcg 
ccaccacgag cacgccgccc 
tgtccgcgac ggtctccaga 
gcccgctcag ggcgaagtcg 
gttcggccag ggcccgcatc 
cgcggaaggc ctccagatgg 
cgggccgacg ggacctgatc 
cccggctgcg gtagaccatc 
cctcgtccac cagggcgacc 
ggaagcagca gacgcggaac 
cgatctggca gtactcccgc 
cgccggtcgc gcggtaccgg 
cggtcccgat cacccggacc 
cggcgaagtc cgtgaagtag 



2 

ctcattcgga ggttgttcag 
ctccacccga cgactctcgc 
atctactacc cgctctccgt 
acaccgcacg aactcccccg 
cgcctggcct acgccgagca 
gccgaccacg tgggaagcga 
agttcttttc agggggtgct 
ggttatccgg tcaaggatcc 
ctcgtctcca tcgaggagaa 
tatttctacg acaacgaggt 
gaactcgaaa tcaccgacat 
gacctgggcc ggggattcgc 
tcgcagtacg tgtccgccct 
gtggccctcc gcatgggctt 
tccggctccg gctacgggca 
tgagcggccg tgccgggtgg 
aacaaccccc ggccggtcag 
aacgcggcga ggtactcctc 
cgtatcccgg tggcctcgcc 
ctggtggagt gggacacccg 
acgtagccct ccaggaaccg 
ggcaccaggt gcgcggccat 
tacccgatgg agcagaacag 
cgcatgtccc cgggccgcac 
tcgtccgaca gctccaggcc 
gcgccggtgc cgcaggcgac 
tccgcggtga cccgttcggc 
tcgtacacgt ccgccagttc 
gcccgggtcc acccggcgcc 
ccgaaggaga ccggcaggcg 
tcccggccca ccacgtgcgc 
tcgatgatgt ggccgaaggg 
ccgtggtcga gaagcatccg 
cgcggggtgc ccgcgtgccg 



ggtgaaggga 


1200 


ggtgtccaag 


1260 


gctgatgctg 


1320 


aatgcgccgt 


1380 


ggagaaaccc 


1440 


tgccgttgcg 


1500 


gcgcaaggaa 


1560 


ccagcgttat 


1620 


accggtacgc 


1680 


ggtggacatc 


1740 


caaccgtacc 


1800 


ctggctcgac 


1860 


ggaggaacgc 


1920 


catcaacgcc 


1980 


gtacgtgatg 


2040 


gcgaacggcc 


2100 


cccgtcgtcc 


2160 


cctggtgaac 


2220 


gaccaggaac 


2280 


ggccacggtc 


2340 


ctcggggaac 


2400 


cgtgcgcacc 


2460 


gcagaccacg 


2520 


cggcaccccc 


2580 


ctccgtgtgc 


2640 


gtcgagcagc 


2700 


ctcgtccgcc 


2760 


ccggccgtac 


2820 


ggcgccggcg 


2880 


gtcgaggttc 


2940 


gggccacagc 


3000 


cgcgtccagg 


3060 


taccgcgggc 


3120 


ctgggcaccg 


3180 



WO 00/24775 



PCT/FI99/00870 



3 



gtgtgcagca 


gcacgatgtc 


cccgggccgc 


aacgcgcacc 


cggtccgggc 


cagttccttc 


3240 


tccaggcgcg 


cggcgctcac 


ggtgcccgtc 


ggagcgtcgg 


tgaggtccag 


caccaccccg 


3300 


cgcccgaaga 


accactccag 


cggcatctgg 


tcgatgtggc 


gggggacgcc 


gtccccgtac 


3360 


agcgcgcgcg 


aaccatagtg 


cgacggcgcg 


tcgacgtgcg 


tgccggtgtg 


cgtggtcagc 


3420 


gtgatcctgt 


ccagtgacag 


gaactcgccg 


tccggcagtt 


cgtccggaga 


gaactcgaca 


3480 


ccgaagtgct 


cgcgcatctc 


cgcgcacatg 


tgttccgcgc 


cctgccgggg 


cgtgaggacg 


3540 


tcgtgcacca 


ccgggtcggg 


ctcgt.act.gt 


gaggaatcca 


ccggtgacga 


aaggtcgatg 


3600 


agccgcacgc 


gcacctccgg 


gttcgtagac 


gggctcggct 


gacgcagcgc 


gggtacgacg 


3660 


ctgacacgcc 


cctcttgacg 


tggcctggaa 


gctggttcga 


cgggcgggca 


ccgcacgcga 


3720 


cggccggcgc 


cgcaccggcg 


ccgtcccggc 


cgagcgggaa 


tccagggagg 


gtatagcggc 


3780 


gcgccccacg 


ctgccgtcat 


ggtgatgaaa 


ctgacggaca 


gcgagctggg 


gcgtgcgctg 


3840 


ctctcgctgc 


gtggttacca 


gtggctccgc 


ggcatccacc 


acgatcccta 


cgccctgctg 
ctgcgcgccg


agagcgacga 


tccggcgcag 


ctcggccggc 


tgctgcgtga 


acgcggccgg 


3960 


ctccaccgca 


gcgacaccgg 


cacctgggtc 


accgcggacc 


atgcgacggc 


ctcccggctg 


4020 


ctcgccgacc 


cgcgcttcgt 


gctgcgccgc 


ccgccggccg 


ggcccgccac 


cggcaccggg 


4080 


gacgtcatgc 


cgtgggaaga 


ggccacgctg 


agcgacctgc 


tgcccctcga 


cgaggcgcgc 


4140 


ctgacgaccg 


accgggcacg 


gtgccgccgg 


ctcggcgcga 


ccgccgcgcg 


gatcgcggcg 


4200 


gacggtcccg 


fccgcgacgcg 


actcgcggac 


ctggccgggg 


cccgagccga 


acaggtgcgc 


4260 


tcaacgggcc 


acttcgacct 


cagggccgac 


tacgccctcc 


cgtacgcggt 


cgagccggcc 


4320 


tgcgcgctgc 


tcggcctgcc 


ggccgggcag 


tgttccctct 


tcggcgcctt 


ctccccggcc 


4380 


gtcctgctcg 


acgcgacggt 


cgtaccgccc 


cgccttccgg 


aggcgcgcgc 


cctgatcgcc 


4440 


tccacggcgg 


aactgaccgc 


cctctggccg 


cggctggccc 


cgagcctgtc 


gaagaccgtc 


4500 


ccggaggacg aagcgccgga cctcttcctg ctgacggccg tgttactcgt 


accggccgtc 


4560 


gtccacctgg tctgcgaggc 


ggtcgccgcc 


ctgtcgcacg 


accccgggca 


ggccgggctg 


4620 


ctcagggacg 


acccggtact 


cgccgcaccg gcggtcgagg 


agacgctgcg 


ccacgcaccg 


4680 


cccgcccgtc 


tgttcaccct 


ccacgcgacc 


ggaccggagc 


gcgtcgcgga 


cgtcgacctc 


4740 


cccgcgggcg 


ccgaggtcgc 


cgtcgtcgtg 


gcggcggcgc 


accgcgatcc 


ctcctggtgc 


4800 


ccggaccccg 


accgcttcga 


cctcaccagg 


aacgagcggc 


atctggcact 


gccgccggat 


4860 


ctgccgctgg 


gggcgctcgc 


cccgctgctg 


cgcgtctgcg 


cgaccgcggc 


cgtcgcggcc 


4920 


ctcgcggccg 


gactcctccc 


gctgcgggcc 


gtcggcccgc 


ccgtacgacg 


gctgcgtgcc 


4980 


ccggtcaccc 


ggtccgtgct 


gcgcttcccc 


gtcgccccgt 


gctgagcagc 


ccctcctcac 


5040 


gtcatccccg 


gcccgccttc 


ccccgcccgc 


aacggaaggg 


actctccatg 


gacaaccgcg 


5100 


agaccgtacg 


accggtgagc 


gtctgccggg tctgcggcgg 


caacgactgg 


caggacgtcg 


5160 


tggacttcgg 


tgacgttccc 


ctcgccaacg 


gcttcctgtc 


cccggccgac 


tcctacgaga 


5220 
acgagcgccg

cccacgtggt 

aaatgatcac 

ccccggacag 

gcgaagcggg 

ggcgcaacgg 

ggcgcgacca 

acgtgtcgga 

"bcgaggtgcc 

agcactfcgtc 

gggtgctcga 

acgaggacgg 

agcggggcct 

gcaccgaact 

acggtgctcc 

tggaatacrtg 

taccggtgca 

cctggaacta 

ggttcatcgt 

cccgccgggc 

tgcgacgccg 

cgggcccggc 

ccggaacacc 

gatcccgcgc 

accggcccac 

ggcacgctcc 

gcgcaccacg 

gctgcggccg 

gtcgcccggg 

gcggcgggcc 

ctcggcgccg 

ggcgaacgcg 

caccgcctcg 

ggcgagcatg 



ctacccgctg 
cgaccccgag 
ccagcacatg 
cctcgtcgtg 
gatgcgcacc 
catcgagacc 
cgggcaggcg 
catcgcggcc 
gtacgttctg 
gtacttcacc 
cgtggagcgg 
cccctggccc 
ctacgacgac 
gccggaactg 
ggccaagggc 
caccgacacc 
cgctcccgag 
cgccacggag 
gcccatcccc 
agcagctgac 
accagccgcc 
cggtcggccg 
tcccgggcca 
gcccggtctg 
gtcggctgcc 
agcatcgtgc 
gtgcccgtat 
tacaccgtgc 
aagacgtagt 
agcagccggg 
tccacgtccg 
gcgtccaccg 
gcggcgggcc 
ccttctcctc 



ggcgtcctgt 

gtgctgtacc 

cggcacatca 

gagctgggca 

ctgggcgtgg 

ttccccgact 

cggctcgtgc 

ggcgtacgcg 

gacctgctgg 

atgcggtcct 

tfccggcgtgc 

gaacgtccct 

gccacctacc 

ctgcgctccc 

aacaccat.cc 

accgagctga 

cacgccaagg 

atcctcgaca 

cgcccgtcga 

gcatcgcctc 

agcggtcgtg 

tcgccaccgg 

gctcgtacca 

gcggcgtgcg 

cccactggtc 

gcacgaagct 

ccggcagcag 

gcgggcccgg 

cggtcgagac 

gcccgccgcc 

tgaaggcggc 

cccgcccgtc 

ggctcctgcc 

cggtgaccag 



cctgccgcgc 

gcgactacgc 

ccgcgctgtg 

gcaataccgg 

accccgcgcg 

tcttctccca 

tgggacggca 

aactcctgtc 

agaaggtcgc 

tcgtcaccct 

acggcggatc 

ccgtccccga 

gcacgttcgc 

tcgtggccca 

tcacggtgtg 

agcagggcag 

aacacatccc 

aggagacggc 

tcctcacgtc 

gcgcagggct 

cccgagcacc 

gcgcacccgt 

ggtggccgcc 

ggccagcgtc 

gttgacgacg 

gcggccctgc 

cgacagcacg 

agcgtccgac 

gtggatcagc 

gttgacgcgc 

gcagttgacc 

ggtgatgtcc 

ggtctccgcc 

cacgcgcatc 



ctgccggctg 

ctacaccacc 

ccgcacccgt 

ccgtcagctc 

caacctcacg 

cgacgtggcc 

tgtcttcgcc 

tcccgacggg 

gttcgacacc 

cttcgcgcgc 

ggtcctcgtc 

actgctgcgc 

gcagcggatc 

gggcaagcgc 

cgggctcggc 

ggtgctgccc 

cgactactac 

cttccgggac 

cc cgtcaggt 

gcacgccagt 

gtgcacgccg 

tccgggtccg 

ccggcgttgg- 

accagcagcc 

tcgacatggc 

ccgccgtaga 

gcccgttccc 

tcgccgtaag 

cgtacgccgt 

atcgcctccg 

accacccgcg 

agcgcgcgcc 

agggccgcgg 

ccgctcaccg 



atgagcctga 

cccgactccg 

ttcgagcttc 

atggccttcc 

gacgtcgccc 

cgcaccatcc 

cacatcgacg 

gtgttcgcga 

atctaccacg 

cacgggctgc 

ttcgtgggcc 

gtggaacggc 

gagcgggtgc 

atcgtcggct 

ctgaaggagc 

ggcacccaca 

ctgttgctcg 

aacggcggcc 

tcctgaggcg 

cgcggggcgg 

gccggggcgc 

cgcccgccag 

tggcgtggaa 

gggccacgtc 

cgtcgtccgg 

gccacgccgt 

cggccagttt 

ggctgcgggt 

ggcgcgcaca 

cccaccgcga 

gccggtgcgc 

gcccgagtac 

tcaggtgccg 

gaccccgggg 



5280 

5340 

5400 

5460 

5520 

5580 

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 
acgacggtgg acgtaccgcc cggcgccgtg actccccgct tgagcggctc ccaccaggac
cggttctcgc ggtaccactg gaccgtcgag cgcagccccg aggagaactc ccgcgccgga 
cggtagccca gttcctcacg ggccctgccc cagtccaggc tgtaacgcag gtcgtgcccc 
ttgcggtcgg gcacgtgccg gacgctgctc cagtccgccc cgcacagctc cagcaacata 
cccaccagct cccggttgga gagctcccgg ccgccgccga tgtggtacac accgccgggc 
cggcccgcgg tgcgcaccag gtccacgccc cggcagtggt cctccacgtg cagccactcc 
cgcacgttcc gcccgtcccc gtacagcggc accggcagcc cgtccaacaa gttggtgacg 
aagcgcggga tgagcttctc cgggtgctga cgcgggccgt agttgttgga acagcgggtc 
acccgcacgt ccaggccgtg cgtgcggtgg caggcgaacg ccatcaggtc ggccgacgcc 
ttggaggcgg cgtacgggga gttggggctc agcgggtgct cctccggcca ggaaccggac 
gcgatggagc cgtagacctc gtccgtggac accaggacga agggctccac gccgtggcgc 
agcgcggcgt ccagcagccg ctgggtgccg acgacgttgg tcagcacgaa gtcgtcggcc 
gcgcggatgg accggtcgac gtgcgactcc gcggcgaagt ggacgacctg gtcgctgtgt 
gccatcagct cgtcgaccag ctcggcgtcg aggatgtcgc cccgcacgaa gcgcagccgg 
tcaccgcgta ccgcgtccag gttcgtgagg ttgcccgcgt acgtcagttt gtcgaggacg 
gtgacgcgta ccgccggggc ccccgctccg ggggcccggt tctccagcag catgcgcaca 
taggccgagc cgatgaaacc gaccgcgccg gtgaccagga tgttcacgtc cgtcgtcgcg 
gaggtgtgcg acgccatggg ttccctccat ccgtcgggtg ccgtggggcg gagtgcgccc 
cctcgaccca gcgtcggggg cggccgtgga ggagcggttg agcttcggcg cagcggcggc 
tcgaccggcg gcggccggcg tcgccggact ccaacggttc tcgacggaac gaccaacggc 
cctggcgaga ctgcccggac agcccggccg agagagggag gacccgttga gccgtcagac 
agagatcgtc cgccggatgg tgagcgcctt caacaccggc aggaccgacg acgtggacga 
gtacatccac cccgactacc tcaatccggc caccttggaa cacggcatcc acaccgggcc 
caaggcgttc gcccagctgg tcggctgggt gcgggcgacg ttctccgagg aagcccgcct 
ggaggaggtg cggatcgagg agcgcggccc gtgggtcaag gcctacctcg tgctctacgg 
ccgccacgtc ggccggcttg tcggtatgcc gcccaccgac cggcgcttct ccggtgaaca 
ggtgcacctg atgcgcatcg tcgacgggaa gatccgcgac caccgggact ggcccgactt 
ccaggggacg ctgcgccagc tcggcgaccc gtggcccgac gacgagggct ggcgtccgtg 
accgtccctg aaaccgcacc cgacgagaca tcagaccagg aaggatggct catgccggat 
cccggcggcc cgaccacggc cgagaacctg tcgaaggagg ctgtccgctt ctaccgcgag 
cagggttacg tgcacatccc gcgcgtcctg tcggagacgg aggtgaccgc cttccgggcc 
gcctgtgagg aggtcctgga gaaggagggc cgcgagatct ccggcatcgc cctgcggctg 
gccggcgcgc ccctgcgggt ctacagcagc gacatcctgg tcaaggagcc caagcgcacc 
ctgcccaccc tggtccacga cgacgagacg ggactgccgc tgaacgagct gagtgccacg 



7320 

7380 

7440 

7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 
ctgacggcct


ggatcgcgct 


gacggacgta 


cccgtcgaac 


gcggctgcat 


gagctacgtg 


9360 


ccgggctccc 


atctcagggc 


ccgcgaggac 


cggcaggagc 


acatgaccag 


cttcgccgag 


9420 


tfcccgggacc 


tcgcggacgt 


gtggcccgat 


tacccgtggc 


agccgcgcgt 


cgccgtgccc 


9480 


gtccgcgccg 


gagacgtcgt 


gttccaccat 


tgccgtaccg 


tccacatggc 


cgaagccaac 


9540 


accagcgact 


cggtccgcat 


ggcgcatggc 


gtcgtctaca 


tggacgcgga 


cgccacctac 


9600 


cggccgggcg 


tccaggacgg 


ccacctgtcc 


cgcctgtcgc 


cgggagatcc 


actcgaaggc 


9660 


gagctgttcc 


ccctggtcac 


ggcaggcaca 


cggcagtgag gtccgccgtt 


cccggcggtc 


9720 


gcgggaccgc 


cggggacggc 


accgtcagcc 


ggccagcgcc 


acgagcttgg 


cggccgtctc 


9780 


ggccggcggc 


ggcatctcgc 


tcatctcctg 


ccgcacccgc 


agggccgcct 


cccgcaaccc 


9840 


cgcgtcgtcc 


agcagccgtc 


ggcactgctc 


ggcacccagc 


gatcccgcct 


cggcatcgaa 


9900 


cccgatgccc 


agcccggtca 


gcacatcgcg gttggtgtcc 


tggtaggagc 


cgtgcgggat 


9960 


gacgcactgc 


gggacgccgg 


cggccagggc 


cgtcagcagt 


gtgccgctgc 


ccccgtgatg 


10020 


gatgatcgcg 


tcgcacgtct 


ccagcagcgc 


gcccagcgga 


atccactcca 


ccaccggtac 


10080 


gttcgcgggc 


agttcaccga 


gcagggccag 


gtcgccgccg 


cccagggtca 


gcacgaactc 


10140 


cgcgtccacg 


tccgccactt 


cggagaacag 


c ggggccagc 


ttggcgatgc 


cgcccgacag 


10200 


cgcgtcgatg 


gagcccagcg 


tcaccgcgat 


acgccgccgg 


ccggccgcgg 


gcggcagcca 


10260 


gtccggcagc 


accgctccgc 


cgttgtaggg 


gacgtaccgc 


atcggccagg 


cacccgggga 


10320 


gcgccggtcc 


tccggcagca 


gcgcctccac 


gctcggcggt 


gtcgtcgtca 


gccgcacgga 


10380 


accggtcggc 


tcgccggtga 


cgccgtggcg 


ctcgtagtcc 


ttggacatcg 


cccgccggat 


10440 


gagcgcgccg 


agccccggct 


cgctgtccgc 


gggacccagc 


ggcagctcta 


cgcacggcag 


10500 


ttgcagcgct 


gccgccgtca 


gcgggcccgc 


gccctgtgtc 


ggagtgtgca 


cgacgaggtc 


10560 


gggccgccag 


ctccgcgccg 


tccgcagcgc 


cccgtcgacg gccaccgccg 


atacccgggc 


10620 


gaacatctcg 


gcgaagaagc 


cctcgcccag 


cccctcggag 


tgcatcgggt 


cggtgacgtc 


10680 


ggtgtcgtcg 


ggcacgaaca 


gcttcgcgta 


gttcacgccg 


ggcgacacgt 


ccacggcgca 


10740 


cagcccggcc 


tccgcgacgg 


cgcggatgtc 


gccccccgtg gcgtagcgga 


cctcgtggcc 


10800 


gagagcgcgc 


agcgcctgtg 


ccagcggcac 


cgtcggcagg 


atgtggctga 


gcccgggtga 


10860 


agtgatgaac 


aacgcacgca 


tgatgccccc 


tgttcgacat 


gaacctggaa 


cacgcatcct 


10920 


gacggcgcct 


tctgttgctc 


cggtcgacgc 


ccggtcgaca 


ggccctcgta 


cagcccgccg 


10980 


ggggccggtc 


cggccacgac 


gcaggctcca 


gcggacgtcg 


acggcgggga 


cgcagcgtgg 


11040 


tcgccgggag 


gcatcgatga 


cagtattggt 


aaccggagcc 


acaggaaacg 


tcggccggca 


11100 


cgtcgtcacc 


gggctactgg 


ccgccggccg 


ccgggtgcgg 


gcgctgaccc 


gcacacccga 


11160 


ccggtccggc 


ctgcccggcg 


gcgcggagat 


cacaggcggc 


gacctgaccc 


gcccggagac 


11220 


ctacgagcgg atgctggacg gtgtcgaagc 


cgtctacctg ttccccgtcc 


cggagaccgc 


11280 


cgcggcgttc 


gccggggccg 


cgcgacgggc 


cggtgtccgg 


cggatcgtgg 


tgctctcctc 


11340 
ggactccgtc


accgacggca 


ccgacaccgg 


aggacaccgg 


cgcgtggaac 


tggccgtgga 


11400 


ggacacgggg 


ctcgagtgga 


cccatgtgcg 


ccccggcgag 


ttcgcgctca 


acaaggtcac 


11460 


cctgtgggcg 


ccgtcgatcc 


gcgcggaggg 


cgtcgtccgg 


tccgcgtatc 


cggacgcccg 


11520 


ggtggccccg 


gtgcacgagg 


ccgacgtcgc 


ggccgtcgcg 


gtgaccgcgc tgctgaagga 


11580 


ggggcacgcc 


ggccgcgcct 


acagcgtgac 


cggaccgcag 


gccctcaccc 


agcgcgaaca 


11640 


ggtccgcgcg 


gtaggggagg 


ggctcggccg 


gtccctcgcc 


ttcgtcgagg 


tgacccccgg 


11700 


gcaggcgcgg 


gccgacctga 


ccgcccaggg 


gctgcccgcg 


cccatcgccg 


actacgtcct 


11760 


cgccttccaa 


gccgggtgga 


ccgagcggcc 


cgcccccgcc 


cggccgaccg 


tgcgggaggt 


11820 


caccggccgg 


cccgcccgca 


cgctcgccca 


gtgggccgcc 


gaccaccgag cggacttccg 


11880 


gtgaccggag 


accgcgtcca 


ccgcgccacg 


acagaaaggc gacgcccgtg cgcgtactgc 11940

tgacgtcctt cgccatggac gcccacttct gcaccgccgt gccgctggcg tgggcactgc 12000

ggtcggccgg gcacgaggta cgggtggccg gccagcccgc gctcacctcc accatcacgg 12060

gagccggcct gaccgccgtg ccggtcggcc gcgaccacac gcacggcagc ctcctgggcc 12120

gggtcggcag cgacatcctc gccctgcacg acgaggcgga ctacctggag gcccgtcacg 12180

acgccctggg cttcgagttc ctcaaagggc acaacacggt gatgtccgcg ttgttctact 12240

cgcagatcaa caacgactcg atggtcgacg acctggtgga cttcgcccgt cactggcggc 12300

ccgacctggt cgtctgggag ccgttcacct tcgcgggcgc cgtggccgcg cgggcctcgg 12360

gcgccgccca cgcccgcctg ctgtccttcc ccgacctgtt cctcagcacg cgccgcctct 12420

tcctggagcg catggcgcgc caggagcccg agcatcacga cgacacactc gccgaatggc 12480

tcgactggac ccttggccgg cacggccact ccttcgacga ggagatcgtc acggggcagt 12540

ggtccatcga ccagaccccc gcccccgtgc ggctcgacgc cggcggtccc accgtgccga 12600

tgcggtacgt cccctacagc ggactggtgc ccacagtggt gcccgactgg ctgcgcaggc 12660

cgcccgagcg gccacgggtc ctggtcaccc tcggcatcac ctcacggcgg gtgaagtcct 12720

tcctcgccgt ctccgtggac gaccttttcg aggccgtggc cgggctcggc gtcgaggtgg 12780

tcgccaccct cgacgccgac cagcgggagc tgctggggcg cgtgccggac cacttccgca 12840

tcgtcgagca cgtgccgctg gacgccgttc tgccgacctg ctcggcgatc gtccaccacg 12900

gcggagccgg cacctggtcg acggccgccg tgtacggggt gccgcaggtc tccctgggct 12960

cgatgtggga ccacttctac cgggcccgtc gcctggagga actcggggcg gggctgcggc 13020

tgccctccgg cgagctgact gccgaggggc tgcgcacccg gctggagagg gtgctcggcg 13080

agccctcctt cggcaccgcc gcgcaggcgc tgagcgacac catcgcggcg gaacccagcc 13140

ccagcgaggt cgtgccggtc ctggaggagc tgaccggacg gcaccgtccc ggcacccggg 13200

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13260

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ccgtccgggc ccctcgccgg 13320

tgagggagcc cggatcacag tccgtccggc accacgccca ggtcccggaa cagcggggag 13380
aagttgaaga cgtcccagtg ctccacgacc ttgccggctt cggagaagcg cagctcctcc 13440

aagtaggtcc agcggacctt gcggccggtg ggggcgatgc ccatgaacac gccctggtgc 13500

gtggccgagc aggtgatccg cagcatcacg cggtcgccct cgcccacgat gctccgcacg 13560

tccagacgaa ggtccgggaa ggcctccacc gcgctgttca tacgccgtac gacctcctcg 13620

gcgctcaccg gtttgtcctc gtcgtcgtag tggacgacgt cgggtgccca gtgcgcgacc 13680

accccggaga cgtcccaccg gttccatgcg gccaccatct ccaggcagcg ttccttgttc 13740

gcggtcgttg acatgtcgac tccttgaagg cccgggacta ctggtcacgc gccagccttc 13800

caacccgccc cggaaaagcg gtgcacgacc gctggagccc gcaccggaac ctgcgcggcg 13860

gagctgaacg gggtttcgag ccgttcacca aggacctgcc gcagcctgtt acggcacacc 13920

ctgacgcctc gctccgcgcg ggacgcgccc gccgggagga aggacacacc accatgtcgg 13980

tacgcaccga tcagacggcg gcaccggaag accgagcggc ggccacggat cccgggttcg 14040

ggcacctgta cgcgcaggtg cagcagttct acgcccggca gatgcagctc ctcgactccg 14100

gcgcggccga ggagtgggcc gccaccttca ccgaggacgg cacgttcgcc cggccctcct 14160

cgccggaacc ggcacgcggc cacgccgaac tggccgccgg cgcccgcgcc gccgccgaac 14220

gcctcgccgc cgagggcctt tcgcaccggc acgtcatcgg catgaccgcg gtacgccggg 14280

aacccgacgg cagcgtgttc gtacgcagct acgcccaggt cttcgccacc cgccgcgggg 14340

aagctccccg gctgcatctg atctgcgtct gcgaggacgt gctcgtgcgg gaggggccgg 14400

ggctgaaggt gcgggaacgg gttgtcacgc acgacgcgtg agggcggtcg acccgccggc 14460

cgagccgcac ctctgccacc ccctcggcac gccagccggc gtcgagtccg ctgcgagagg 14520

gcgcacttag cgtgcgagcc atgactgact cgacaggtcc ccgcccggtg cccgccatgt 14580

cacccgcccc cagccccacg ccttcccccg gccccgcccc cgggagcgaa cccgcgccgc 14640

tcgccgtgat cgtcaccggc ggcggttcgg gtatcggccg ggccaccgcc cgcgccttcg 14700

ccgctcaggg tgcgaaggtg ctcgtcgtcg gccgtaccga ggacgcgctc gcgcagaccg 14760

ccgagggctg tgcggacatg cgtgtgctcg tcgccgacgt ggcctcgccc gacgggccgc 14820

aggcggtcgt caacgccgcc ctgcgggagt tcgggaggat cgacgtcctg gtcaacaacg 14880

ctgccgtggc gggcatggag accctgcaga ccgtcgaccg ggacgccgtg gcacggcagt 14940

tcggcaccaa tctgacggct cccctcttcc tcgtccagtc cgcactcggc gcgctggaga 15000

agtcgcgcgg catcgtcgtc aacgtgggga ccgccgcgac cctgggcctg cgcgccgccc 15060

cgaccggcgc gctgtacggg gcgagcaagg tggccctcga ctacctgacc cggacctggg 15120

ccgtcgaact ggccccccgg ggcatccgtg tcgtcggcgt ggcacccggg gtgatcgaca 15180

cgggcatcgg cgtccgcatg ggcatgaccc cggagggcta ccgggagttc ctgaccggca 15240

tgggcggcag ggtgcccgtg ggccgggtcg gccgtccgga ggacgtggcc tggtggatcg 15300

tccagctcgc ccgcccggag gccggctacg cgacgggcat ggtcgtcccc gtcgacggcg 15360

ggctgtcgct ggtctgaccg gacaaggaag gaaataccgc aggaaggaag taccgcagca 15420

aggaaatacc gcaggaagga gatatcgccg tgcaggaaac cgaacccggc gtccccgcgg 15480

acctgcccgc cgagagcgac cctgccgccc tggagcgcct cgccgcacgg taccggcggg 15540 

acggctacgt ccacgtcccc ggcgtcctcg acgccgggga ggtcgccgaa tacctggccg 15600 

aggcccgtcg gctcctcgcc cacgaggagt ccgtgcgctg gggctccggc gccggcaccg 15660 

tcatggacta cgtcgccgac gcccagctcg gcagcgacac gatgcgccgc cttgccaccc 15720 

acccgcgcat cgccgccctc gccgagtacc tggccggctc gcccctgagg ctgttcaagc 15780 

tggaggtgct gctcaaggag aacaaggaga aggacgcctc ggtccccacc gccccgcacc 15840 

acgatgcgtt cgccttcccg ttctccaccg ccggcaccgc cctgacggcg tgggtcgcgc 15900 

tggtcgacgt cccggtggaa cgcggctgca tgaccttcgt ccccggatca cacctgctgc 15960 

cggatcccga taccggcgac gagccgtggg ccggggcctt cacccggccg ggagagatct 16020 



<210> 2
<211> 342
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogI, function: aminotransferase"
<400> 2

Met Thr Val His Val Trp Asp Tyr Leu Pro Glu Tyr Glu Leu Glu Arg
1 5 10 15

Glu Asp Ile His Asp Ala Val Glu Thr Val Phe Arg Ser Gly Arg Leu
20 25 30

Val Leu Gly Glu Ser Val Arg Gly Phe Glu Ser Glu Phe Ala Ser Phe
35 40 45

Gln Gly Val Gly His Ala Val Gly Val Asp Asn Gly Thr Asn Ala Val
50 55 60

Lys Leu Gly Leu Gln Ala Leu Gly Val Gly Pro Gly Asp Glu Val Val
65 70 75 80

Thr Val Ser Asn Thr Ala Ala Pro Thr Val Val Ala Ile Asp Ser Ala
85 90 95

Gly Ala Thr Pro Val Phe Val Asp Val Arg Glu Glu Asp Tyr Leu Met
100 105 110

Asp Thr Ser Gln Val Glu Ala Val Leu Thr Pro Arg Thr Arg Cys Leu
115 120 125

Leu Pro Val His Leu Tyr Gly Gln Cys Val Asp Met Ala Pro Leu Arg
130 135 140

Asp Leu Ala Ala Arg His Asn Leu Val Ile Leu Glu Asp Cys Ala Gln
145 150 155 160

Ala His Gly Ala Arg Arg His Gly Arg Leu Ala Gly Ser Thr Gly Asp
165 170 175

Ala Ala Ala Phe Ser Phe Tyr Pro Thr Lys Val Leu Gly Ala Tyr Gly
180 185 190

Asp Gly Gly Ala Val Leu Thr Asp Asp Glu Arg Val Ala Asp Arg Leu
195 200 205

Arg Arg Leu Arg Tyr Tyr Gly Met Glu Ser Arg Tyr Tyr Val Val Glu
210 215 220

Thr Pro Gly His Asn Ser Arg Leu Asp Glu Val Gln Ala Glu Ile Leu
225 230 235 240

Arg Arg Lys Leu Ser Arg Leu Pro Ser Tyr Ile Glu Ala Arg Arg Ala
245 250 255

Val Ala Arg Arg Tyr Glu Glu Gly Leu Ala Asp Thr Gly Leu Leu Leu
260 265 270

Pro Arg Thr Ala Gln Gly Asn Glu His Val Tyr Tyr Val Tyr Val Val
275 280 285

Arg His Pro Arg Arg Asp Ala Val Leu Glu Ala Leu Arg Ala Ser Tyr
290 295 300

Asp Ile Ala Leu Asn Ile Ser Tyr Pro Trp Pro Val His Thr Met Thr
305 310 315 320

Gly Phe Ser His Leu Gly Tyr Ala Lys Gly Ser Leu Pro Val Thr Glu
325 330 335

Ala Leu Ala Asp Glu Ile
340

<210> 3
<211> 293
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogJ, function: dTDP-glucose synthase"
<400> 3

Val Lys Gly Ile Ile Leu Ala Gly Gly Thr Gly Ser Arg Leu His Pro
1 5 10 15

Thr Thr Leu Ala Val Ser Lys Gln Leu Leu Pro Val Gly Asp Lys Pro
20 25 30

Met Ile Tyr Tyr Pro Leu Ser Val Leu Met Leu Ala Gly Val Thr Asp
35 40 45

Ile Leu Ile Ile Ser Thr Pro His Glu Leu Pro Arg Met Arg Arg Leu
50 55 60

Phe Gly Asp Gly Ala Gln Leu Gly Leu Arg Leu Ala Tyr Ala Glu Gln
65 70 75 80

Glu Lys Pro Arg Gly Ile Ala Glu Ala Phe Leu Ile Gly Ala Asp His
85 90 95

Val Gly Ser Asp Ala Val Ala Leu Ala Leu Gly Asp Asn Ile Phe His
100 105 110

Gly Ser Ser Phe Gln Gly Val Leu Arg Lys Glu Ala Glu Glu Leu Asp
115 120 125

Gly Cys Val Leu Phe Gly Tyr Pro Val Lys Asp Pro Gln Arg Tyr Gly
130 135 140

Val Gly Glu Ala Asn Ala Ser Gly Arg Leu Val Ser Ile Glu Glu Lys
145 150 155 160

Pro Val Arg Pro Arg Ser Asn Arg Ala Ile Thr Gly Leu Tyr Phe Tyr
165 170 175

Asp Asn Glu Val Val Asp Ile Ala Arg Arg Leu Arg Pro Ser Ala Arg
180 185 190

Gly Glu Leu Glu Ile Thr Asp Ile Asn Arg Thr Tyr Met Glu Arg Gly
195 200 205

Arg Ala Arg Leu Val Asp Leu Gly Arg Gly Phe Ala Trp Leu Asp Thr
210 215 220

Gly Thr Pro Glu Ser Leu Leu Gln Ala Ser Gln Tyr Val Ser Ala Leu
225 230 235 240

Glu Glu Arg Gln Gly Ile Arg Ile Ala Cys Ile Glu Glu Val Ala Leu
245 250 255

Arg Met Gly Phe Ile Asn Ala Gln Ala Cys Tyr Glu Leu Gly Ala Arg
260 265 270

Leu Ser Gly Ser Gly Tyr Gly Gln Tyr Val Met Ala Ile Ala Glu Glu
275 280 285

Cys Thr Gly Arg Val
290

<210> 4
<211> 238
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogA, function: aminomethyl transferase"
<400> 4

Val Tyr Gly Arg Glu Leu Ala Asp Val Tyr Glu Met Val Tyr Arg Ser
1 5 10 15

Arg Gly Lys Ser Trp Ala Asp Glu Ala Glu Arg Val Thr Ala Glu Ile
20 25 30

Arg Ser Arg Arg Pro Gly Ala Arg Ser Leu Leu Asp Val Ala Cys Gly
35 40 45

Thr Gly Ala His Leu Glu Ala Phe Arg Gly Leu Phe Ala His Thr Glu
50 55 60

Gly Leu Glu Leu Ser Asp Glu Met Arg Ala Leu Ala Glu Arg Arg Leu
65 70 75 80

Pro Gly Val Pro Val Arg Pro Gly Asp Met Arg Asp Phe Ala Leu Ser
85 90 95

Gly Arg Phe Asp Ala Val Val Cys Leu Phe Cys Ser Ile Gly Tyr Leu
100 105 110

Glu Thr Val Ala Asp Met Arg Ala Ala Val Arg Thr Met Ala Ala His
115 120 125

Leu Val Pro Gly Gly Val Leu Val Val Glu Pro Trp Trp Phe Pro Glu
130 135 140

Arg Phe Leu Glu Gly Tyr Val Ala Gly Asp Leu Ala Arg Gly Glu Arg
145 150 155 160

??? Thr Val Ala Arg Val Ser His Ser Thr Arg Gln Gly Arg Arg Arg
165 170 175

??? Met Glu Val Arg Phe Leu Val Gly Glu Ala Thr Gly Ile Arg Phe
180 185 190

??? Thr Glu Ile Asp Leu Leu Thr Leu Phe Thr Arg Glu Glu Tyr Ala
195 200 205

??? Ala Phe Glu Asp Ala Gly Cys Pro Ala Glu Phe Leu Asp Asp Leu
210 215 220

??? Thr Gly Arg Gly Leu Phe Val Gly Val Arg Gly Ala Gly
225 230 235

<210> 5
<211> 324
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snoaM, function: polyketide cyclase"
<400> 5

Met Thr Ala Ala Trp Gly Ala Pro Leu Tyr Pro Pro Trp Ile Pro Arg
1 5 10 15

??? Pro Gly Arg Arg Arg Cys Gly Ala Gly Arg Arg Val Arg Cys Pro
20 25 30

??? Val Glu Pro Ala Ser Arg Pro Arg Gln Glu Gly Arg Val Ser Val
35 40 45

??? Pro Ala Leu Arg Gln Pro Ser Pro Ser Thr Asn Pro Glu Val Val
50 55 60

??? Arg Leu Ile Asp Leu Ser Ser Pro Val Asp Ser Ser Gln Tyr Pro
65 70 75 80

??? Asp Pro Val Val His Asp Val Leu Thr Pro Arg Gln Gly Ala His
85 90 95

??? Met Cys Ala Glu Met Arg Glu His Phe Gly Val Glu Phe Ser Asp
100 105 110

??? Glu Leu Pro Asp Gly Glu Phe Leu Ser Leu Asp Arg Ile Thr Thr
115 120 125

??? Thr His Thr Gly Thr His Val Asp Ala Pro Ser His Tyr Gly Arg
130 135 140

??? Ala Leu Tyr Gly Asp Gly Val Pro Arg His Ile Asp Gln Met Leu
145 150 155 160

??? Glu Trp Phe Phe Gly Arg Gly Val Val Leu Asp Leu Thr Asp Pro
165 170 175

??? Thr Gly Thr Val Ser Ala Ala Arg Leu Glu Lys Glu Leu Ala Thr
180 185 190

??? Gly Cys Ala Leu Arg Pro Gly Asp Ile Val Leu Leu His Thr ???
195 200 205

Ala His ??? Ala Gly Thr ??? ??? Arg Tyr Phe Thr Asp Phe Ala Gly
210 215 220

??? Pro Ala Val ??? ??? ??? Leu Leu Asp His Gly Val Arg Val ???
225 230 235 240

Ile Gly Thr Asp Ala Phe Ser Leu Asp Ala Pro Phe Gly His Ile Ile
245 250 255

Asp Arg Tyr Arg Ala Thr Gly Asp Arg Ser Val Leu Trp Pro Ala His
260 265 270

Val Val Gly Arg Glu Arg Glu Tyr Cys Gln Ile Glu Arg Leu Ala Asn
275 280 285

Leu Asp Arg Leu Pro Val Ser Phe Gly Phe Arg Val Cys Cys Phe Pro
290 295 300

Val Lys Val Ala Gly Ala Gly Ala Gly Trp Thr Arg Ala Val Ala Leu
305 310 315 320

Val Asp Glu Asp

<210> 6
<211> 408
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogN, function: unknown"
<400> 6

Met Val Met Lys Leu Thr Asp Ser Glu Leu Gly Arg Ala Leu Leu Ser
1 5 10 15

Leu Arg Gly Tyr Gln Trp Leu Arg Gly Ile His His Asp Pro Tyr Ala
20 25 30

Leu Leu Leu Arg Ala Glu Ser Asp Asp Pro Ala Gln Leu Gly Arg Leu
35 40 45

Leu Arg Glu Arg Gly Arg Leu His Arg Ser Asp Thr Gly Thr Trp Val
50 55 60

??? Ala Asp His Ala Thr Ala Ser Arg Leu Leu Ala Asp Pro Arg Phe
65 70 75 80

Val Leu Arg Arg Pro Pro Ala Gly Pro Ala Thr Gly Thr Gly Asp Val
85 90 95

Met Pro Trp Glu Glu Ala Thr Leu Ser Asp Leu Leu Pro Leu Asp Glu
100 105 110

Ala Arg Leu Thr Thr Asp Arg Ala Arg Cys Arg Arg Leu Gly Ala Thr
115 120 125

Ala Ala Arg Ile Ala Ala Asp Gly Pro Val Ala Thr Arg Leu Ala Asp
130 135 140

Leu Ala Gly Ala Arg Ala Glu Gln Val Arg Ser Thr Gly His Phe Asp
145 150 155 160

Leu Arg Ala Asp Tyr Ala Leu Pro Tyr Ala Val Glu Pro Ala Cys Ala
165 170 175

Leu Leu Gly Leu Pro Ala Gly Gln Cys Ser Leu Phe Gly Ala Phe Ser
180 185 190

Pro Ala Val Leu Leu Asp Ala Thr Val Val Pro Pro Arg Leu Pro Glu
195 200 205

Ala Arg Ala Leu Ile Ala Ser Thr Ala Glu Leu Thr Ala Leu Trp Pro
210 215 220

Arg Leu Ala Pro Ser Leu Ser Lys Thr Val Pro Glu Asp Glu Ala Pro
225 230 235 240

Asp Leu Phe Leu Leu Thr Ala Val Leu Leu Val Pro Ala Val Val His
245 250 255

Leu Val Cys Glu Ala Val Ala Ala Leu Ser His Asp Pro Gly Gln Ala
260 265 270

Gly Leu Leu Arg Asp Asp Pro Val Leu Ala Ala Pro Ala Val Glu Glu
275 280 285

Thr Leu Arg His Ala Pro Pro Ala Arg Leu Phe Thr Leu His Ala Thr
290 295 300

Gly Pro Glu Arg Val Ala Asp Val Asp Leu Pro Ala Gly Ala Glu Val
305 310 315 320

Ala Val Val Val Ala Ala Ala His Arg Asp Pro Ser Trp Cys Pro Asp
325 330 335

Pro Asp Arg Phe Asp Leu Thr Arg Asn Glu Arg His Leu Ala Leu Pro
340 345 350

Pro Asp Leu Pro Leu Gly Ala Leu Ala Pro Leu Leu Arg Val Cys Ala
355 360 365

Thr Ala Ala Val Ala Ala Leu Ala Ala Gly Leu Leu Pro Leu Arg Ala
370 375 380

Val Gly Pro Pro Val Arg Arg Leu Arg Ala Pro Val Thr Arg Ser Val
385 390 395 400

Leu Arg Phe Pro Val Ala Pro Cys
405

<210> 7
<211> 422
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snoaG, function: hydroxylase"
<400> 7

Met Asp Asn Arg Glu Thr Val Arg Pro Val Ser Val Cys Arg Val Cys
1 5 10 15

Gly Gly Asn Asp Trp Gln Asp Val Val Asp Phe Gly Asp Val Pro Leu
20 25 30

Ala Asn Gly Phe Leu Ser Pro Ala Asp Ser Tyr Glu Asn Glu Arg Arg
35 40 45

Tyr Pro Leu Gly Val Leu Ser Cys Arg Ala Cys Arg Leu Met Ser Leu
50 55 60

Thr His Val Val Asp Pro Glu Val Leu Tyr Arg Asp Tyr Ala Tyr Thr
65 70 75 80

Thr Pro Asp Ser Glu Met Ile Thr Gln His Met Arg His Ile Thr Ala
85 90 95

Leu Cys Arg Thr Arg Phe Glu Leu Pro Pro Asp Ser Leu Val Val Glu
100 105 110

Leu Gly Ser Asn Thr Gly Arg Gln Leu Met Ala Phe Arg Glu Ala Gly
115 120 125

Met Arg Thr Leu Gly Val Asp Pro Ala Arg Asn Leu Thr Asp Val Ala
130 135 140

Arg Arg Asn Gly Ile Glu Thr Phe Pro Asp Phe Phe Ser His Asp Val
145 150 155 160

Ala Arg Thr Ile Arg Arg Asp His Gly Gln Ala Arg Leu Val Leu Gly
165 170 175

Arg His Val Phe Ala His Ile Asp Asp Val Ser Asp Ile Ala Ala Gly
180 185 190

Val Arg Glu Leu Leu Ser Pro Asp Gly Val Phe Ala Ile Glu Val Pro
195 200 205

Tyr Val Leu Asp Leu Leu Glu Lys Val Ala Phe Asp Thr Ile Tyr His
210 215 220

Glu His Leu Ser Tyr Phe Thr Met Arg Ser Phe Val Thr Leu Phe Ala
225 230 235 240

Arg His Gly Leu Arg Val Leu Asp Val Glu Arg Phe Gly Val His Gly
245 250 255

Gly Ser Val Leu Val Phe Val Gly His Glu Asp Gly Pro Trp Pro Glu
260 265 270

Arg Pro Ser Val Pro Glu Leu Leu Arg Val Glu Arg Gln Arg Gly Leu
275 280 285

Tyr Asp Asp Ala Thr Tyr Arg Thr Phe Ala Gln Arg Ile Glu Arg Val
290 295 300

Arg Thr Glu Leu Pro Glu Leu Leu Arg Ser Leu Val Ala Gln Gly Lys
305 310 315 320

Arg Ile Val Gly Tyr Gly Ala Pro Ala Lys Gly Asn Thr Ile Leu Thr
325 330 335

Val Cys Gly Leu Gly Leu Lys Glu Leu Glu Tyr Cys Thr Asp Thr Thr
340 345 350

Glu Leu Lys Gln Gly Arg Val Leu Pro Gly Thr His Ile Pro Val His
355 360 365

Ala Pro Glu His Ala Lys Glu His Ile Pro Asp Tyr Tyr Leu Leu Leu
370 375 380

Ala Trp Asn Tyr Ala Thr Glu Ile Leu Asp Lys Glu Thr Ala Phe Arg
385 390 395 400

Asp Asn Gly Gly Arg Phe Ile Val Pro Ile Pro Arg Pro Ser Ile Leu
405 410 415

Thr Ser Pro Ser Gly Ser
420


<210> 8
<211> 291
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogC, function: dTDP-
<400> 8

Met Leu Ala Arg His Leu Thr Ala Ala Leu Ala Glu Thr Gly Arg Ser
1 5 10 15

Arg Pro Ala Ala Glu Ala Val Val Leu Gly Arg Arg Ala Leu Asp Ile
20 25 30

Thr Asp Gly Arg Ala Val Asp Ala Ala Phe Ala Ala His Arg Pro Arg
35 40 45

Val Val Val Asn Cys Ala Ala Phe Thr Asp Val Asp Gly Ala Glu Ser
50 55 60

Arg Trp Ala Glu Ala Met Arg Val Asn Gly Gly Gly Pro Arg Leu Leu
65 70 75 80

Ala Arg Arg Cys Ala Arg His Gly Val Arg Leu Ile His Val Ser Thr
85 90 95

Asp Tyr Val Phe Pro Gly Asp Thr Arg Ser Pro Tyr Gly Glu Ser Asp
100 105 110

Ala Pro Gly Pro Arg Thr Val Tyr Gly Arg Ser Lys Leu Ala Gly Glu
115 120 125

Arg Ala Val Leu Ser Leu Leu Pro Asp Thr Gly Thr Val Val Arg Thr
130 135 140

Ala Trp Leu Tyr Gly Gly Gln Gly Arg Ser Phe Val Arg Thr Met Leu
145 150 155 160

Glu Arg Ala Pro Asp Asp Gly His Val Asp Val Val Asn Asp Gln Trp
165 170 175

Gly Gln Pro Thr Trp Ala Gly Asp Val Ala Arg Leu Leu Val Thr Leu
180 185 190

Ala Arg Thr Pro Pro Asp Arg Ala Arg Gly Ile Phe His Ala Thr Asn
195 200 205

Ala Gly Ala Ala Thr Trp Tyr Glu Leu Ala Arg Glu Val Phe Arg Leu
210 215 220

Ala Gly Ala Asp Pro Glu Arg Val Arg Pro Val Ala Thr Ala Asp Arg
225 230 235 240

Pro Gly Pro Ala Pro Arg Pro Ala Cys Thr Val Leu Gly His Asp Arg
245 250 255

Trp Arg Leu Val Gly Val Ala Pro Pro Arg Asp Trp Arg Ala Ala Leu
260 265 270

Arg Glu Ala Met Arg Gln Leu Leu Pro Gly Gly Arg Leu Arg Asn Leu
275 280 285

Thr Gly Thr
290

<210> 9
<211> 350
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogK, function: dTDP-glucose-4,6-dehydratase"
<400> 9

Met Ala Ser His Thr Ser Ala Thr Thr Asp Val Asn Ile Leu Val Thr
1 5 10 15

Gly Ala Val Gly Phe Ile Gly Ser Ala Tyr Val Arg Met Leu Leu Glu
20 25 30

Asn Arg Ala Pro Gly Ala Gly Ala Pro Ala Val Arg Val Thr Val Leu
35 40 45

Asp Lys Leu Thr Tyr Ala Gly Asn Leu Thr Asn Leu Asp Ala Val Arg
50 55 60

Gly Asp Arg Leu Arg Phe Val Arg Gly Asp Ile Leu Asp Ala Glu Leu
65 70 75 80

Val Asp Glu Leu Met Ala His Ser Asp Gln Val Val His Phe Ala Ala
85 90 95

Glu Ser His Val Asp Arg Ser Ile Arg Ala Ala Asp Asp Phe Val Leu
100 105 110

Thr Asn Val Val Gly Thr Gln Arg Leu Leu Asp Ala Ala Leu Arg His
115 120 125

Gly Val Glu Pro Phe Val Leu Val Ser Thr Asp Glu Val Tyr Gly Ser
130 135 140

Ile Ala Ser Gly Ser Trp Pro Glu Glu His Pro Leu Ser Pro Asn Ser
145 150 155 160

Pro Tyr Ala Ala Ser Lys Ala Ser Ala Asp Leu Met Ala Phe Ala Cys
165 170 175

His Arg Thr His Gly Leu Asp Val Arg Val Thr Arg Cys Ser Asn Asn
180 185 190

Tyr Gly Pro Arg Gln His Pro Glu Lys Leu Ile Pro Arg Phe Val Thr
195 200 205

Asn Leu Leu Asp Gly Leu Pro Val Pro Leu Tyr Gly Asp Gly Arg Asn
210 215 220

Val Arg Glu Trp Leu His Val Glu Asp His Cys Arg Gly Val Asp Leu
225 230 235 240

Val Arg Thr Ala Gly Arg Pro Gly Gly Val Tyr His Ile Gly Gly Gly
245 250 255

Arg Glu Leu Ser Asn Arg Glu Leu Val Gly Met Leu Leu Glu Leu Cys
260 265 270

Gly Ala Asp Trp Ser Ser Val Arg His Val Pro Asp Arg Lys Gly His
275 280 285

Asp Leu Arg Tyr Ser Leu Asp Trp Gly Arg Ala Arg Glu Glu Leu Gly
290 295 300

Tyr Arg Pro Ala Arg Glu Phe Ser Ser Gly Leu Arg Ser Thr Val Gln
305 310 315 320

Trp Tyr Arg Glu Asn Arg Ser Trp Trp Glu Pro Leu Lys Arg Gly Val
325 330 335

Thr Ala Pro Gly Gly Thr Ser Thr Val Val Pro Gly Val Arg
340 345 350

<210> 10
<211> 134
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snoaL, function: NAME cyclase"
<400> 10

Met Val Ser Ala Phe Asn Thr Gly Arg Thr Asp Asp Val Asp Glu Tyr
1 5 10 15

Ile His Pro Asp Tyr Leu Asn Pro Ala Thr Leu Glu His Gly Ile His
20 25 30

Thr Gly Pro Lys Ala Phe Ala Gln Leu Val Gly Trp Val Arg Ala Thr
35 40 45

Phe Ser Glu Glu Ala Arg Leu Glu Glu Val Arg Ile Glu Glu Arg Gly
50 55 60

Pro Trp Val Lys Ala Tyr Leu Val Leu Tyr Gly Arg His Val Gly Arg
65 70 75 80

Leu Val Gly Met Pro Pro Thr Asp Arg Arg Phe Ser Gly Glu Gln Val
85 90 95

His Leu Met Arg Ile Val Asp Gly Lys Ile Arg Asp His Arg Asp Trp
100 105 110

Pro Asp Phe Gln Gly Thr Leu Arg Gln Leu Gly Asp Pro Trp Pro Asp
115 120 125

Asp Glu Gly Trp Arg Pro
130

<210> 11 
<211> 235 
<212> PRT 

<213> Streptomyces nogalater ATCC 27451 
<220> 

<223> "translate of snoK, function: unknown" 
<400> 11 

Met Pro Asp Pro Gly Gly Pro Thr Thr Ala Glu Asn Leu Ser Lys Glu
1 5 10 15

Ala Val Arg Phe Tyr Arg Glu Gln Gly Tyr Val His Ile Pro Arg Val
20 25 30

Leu Ser Glu Thr Glu Val Thr Ala Phe Arg Ala Ala Cys Glu Glu Val
35 40 45

Leu Glu Lys Glu Gly Arg Glu Ile Ser Gly Ile Ala Leu Arg Leu Ala
50 55 60

Gly Ala Pro Leu Arg Val Tyr Ser Ser Asp Ile Leu Val Lys Glu Pro
65 70 75 80

Lys Arg Thr Leu Pro Thr Leu Val His Asp Asp Glu Thr Gly Leu Pro
85 90 95

Leu Asn Glu Leu Ser Ala Thr Leu Thr Ala Trp Ile Ala Leu Thr Asp
100 105 110

Val Pro Val Glu Arg Gly Cys Met Ser Tyr Val Pro Gly Ser His Leu
115 120 125

Arg Ala Arg Glu Asp Arg Gln Glu His Met Thr Ser Phe Ala Glu Phe
130 135 140

Arg Asp Leu Ala Asp Val Trp Pro Asp Tyr Pro Trp Gln Pro Arg Val
145 150 155 160

Ala Val Pro Val Arg Ala Gly Asp Val Val Phe His His Cys Arg Thr
165 170 175

Val His Met Ala Glu Ala Asn Thr Ser Asp Ser Val Arg Met Ala His
180 185 190

Gly Val Val Tyr Met Asp Ala Asp Ala Thr Tyr Arg Pro Gly Val Gln
195 200 205

Asp Gly His Leu Ser Arg Leu Ser Pro Gly Asp Pro Leu Glu Gly Glu
210 215 220

Leu Phe Pro Leu Val Thr Ala Gly Thr Arg Gln
225 230 235

<210> 12
<211> 390
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snogD, function: glycosyl transferase"
<400> 12

Met Arg Val Pro Gly Ser Cys Arg Thr Gly Gly Ile Met Arg Ala Leu
1 5 10 15

Phe Ile Thr Ser Pro Gly Leu Ser His Ile Leu Pro Thr Val Pro Leu
20 25 30

Ala Gln Ala Leu Arg Ala Leu Gly His Glu Val Arg Tyr Ala Thr Gly
35 40 45

Gly Asp Ile Arg Ala Val Ala Glu Ala Gly Leu Cys Ala Val Asp Val
50 55 60

Ser Pro Gly Val Asn Tyr Ala Lys Leu Phe Val Pro Asp Asp Thr Asp
65 70 75 80

Val Thr Asp Pro Met His Ser Glu Gly Leu Gly Glu Gly Phe Phe Ala
85 90 95

Glu Met Phe Ala Arg Val Ser Ala Val Ala Val Asp Gly Ala Leu Arg
100 105 110

Thr Ala Arg Ser Trp Arg Pro Asp Leu Val Val His Thr Pro Thr Gln
115 120 125

Gly Ala Gly Pro Leu Thr Ala Ala Ala Leu Gln Leu Pro Cys Val Glu
130 135 140

Leu Pro Leu Gly Pro Ala Asp Ser Glu Pro Gly Leu Gly Ala Leu Ile
145 150 155 160

Arg Arg Ala Met Ser Lys Asp Tyr Glu Arg His Gly Val Thr Gly Glu
165 170 175

Pro Thr Gly Ser Val Arg Leu Thr Thr Thr Pro Pro Ser Val Glu Ala
180 185 190

Leu Leu Pro Glu Asp Arg Arg Ser Pro Gly Ala Trp Pro Met Arg Tyr
195 200 205

Val Pro Tyr Asn Gly Gly Ala Val Leu Pro Asp Trp Leu Pro Pro Ala
210 215 220

Ala Gly Arg Arg Arg Ile Ala Val Thr Leu Gly Ser Ile Asp Ala Leu
225 230 235 240

Ser Gly Gly Ile Ala Lys Leu Ala Pro Leu Phe Ser Glu Val Ala Asp
245 250 255

Val Asp Ala Glu Phe Val Leu Thr Leu Gly Gly Gly Asp Leu Ala Leu
260 265 270

Leu Gly Glu Leu Pro Ala Asn Val Pro Val Val Glu Trp Ile Pro Leu
275 280 285

Gly Ala Leu Leu Glu Thr Cys Asp Ala Ile Ile His His Gly Gly Ser
290 295 300

Gly Thr Leu Leu Thr Ala Leu Ala Ala Gly Val Pro Gln Cys Val Ile
305 310 315 320

Pro His Gly Ser Tyr Gln Asp Thr Asn Arg Asp Val Leu Thr Gly Leu
325 330 335

Gly Ile Gly Phe Asp Ala Glu Ala Gly Ser Leu Gly Ala Glu Gln Cys
340 345 350

Arg Arg Leu Leu Asp Asp Ala Gly Leu Arg Glu Ala Ala Leu Arg Val
355 360 365

Arg Gln Glu Met Ser Glu Met Pro Pro Pro Ala Glu Thr Ala Ala Lys
370 375 380

Leu Val Ala Leu Ala Gly
385 390

<210> 13
<211> 275
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snoW, function: unknown"
<400> 13

Met Thr Val Leu Val Thr Gly Ala Thr Gly Asn Val Gly Arg His Val
1 5 10 15

Val Thr Gly Leu Leu Ala Ala Gly Arg Arg Val Arg Ala Leu Thr Arg
20 25 30

Thr Pro Asp Arg Ser Gly Leu Pro Gly Gly Ala Glu Ile Thr Gly Gly
35 40 45

Asp Leu Thr Arg Pro Glu Thr Tyr Glu Arg Met Leu Asp Gly Val Glu
50 55 60

Ala Val Tyr Leu Phe Pro Val Pro Glu Thr Ala Ala Ala Phe Ala Gly
65 70 75 80

Ala Ala Arg Arg Ala Gly Val Arg Arg Ile Val Val Leu Ser Ser Asp
85 90 95

Ser Val Thr Asp Gly Thr Asp Thr Gly Gly His Arg Arg Val Glu Leu
100 105 110

Ala Val Glu Asp Thr Gly Leu Glu Trp Thr His Val Arg Pro Gly Glu
115 120 125

Phe Ala Leu Asn Lys Val Thr Leu Trp Ala Pro Ser Ile Arg Ala Glu
130 135 140

Gly Val Val Arg Ser Ala Tyr Pro Asp Ala Arg Val Ala Pro Val His
145 150 155 160

Glu Ala Asp Val Ala Ala Val Ala Val Thr Ala Leu Leu Lys Glu Gly
165 170 175

His Ala Gly Arg Ala Tyr Ser Val Thr Gly Pro Gln Ala Leu Thr Gln
180 185 190

Arg Glu Gln Val Arg Ala Val Gly Glu Gly Leu Gly Arg Ser Leu Ala
195 200 205

Phe Val Glu Val Thr Pro Gly Gln Ala Arg Ala Asp Leu Thr Ala Gln
210 215 220

Gly Leu Pro Ala Pro Ile Ala Asp Tyr Val Leu Ala Phe Gln Ala Gly
225 230 235 240

Trp Thr Glu Arg Pro Ala Pro Ala Arg Pro Thr Val Arg Glu Val Thr
245 250 255

Gly Arg Pro Ala Arg Thr Leu Ala Gln Trp Ala Ala Asp His Arg Ala
260 265 270

Asp Phe Arg
275

<210> 14 
<211> 424 
<212> PRT 

<213> Streptomyces nogalater ATCC 27451 
<220> 

<223> "translate of snogE, function: glycosyl transferase" 
<400> 14 

Val Arg Val Leu Leu Thr Ser Phe Ala Met Asp Ala His Phe Cys Thr
1 5 10 15

Ala Val Pro Leu Ala Trp Ala Leu Arg Ser Ala Gly His Glu Val Arg
20 25 30

Val Ala Gly Gln Pro Ala Leu Thr Ser Thr Ile Thr Gly Ala Gly Leu
35 40 45

Thr Ala Val Pro Val Gly Arg Asp His Thr His Gly Ser Leu Leu Gly
50 55 60

Arg Val Gly Ser Asp Ile Leu Ala Leu His Asp Glu Ala Asp Tyr Leu
65 70 75 80

Glu Ala Arg His Asp Ala Leu Gly Phe Glu Phe Leu Lys Gly His Asn
85 90 95

Thr Val Met Ser Ala Leu Phe Tyr Ser Gln Ile Asn Asn Asp Ser Met
100 105 110

Val Asp Asp Leu Val Asp Phe Ala Arg His Trp Arg Pro Asp Leu Val
115 120 125

Val Trp Glu Pro Phe Thr Phe Ala Gly Ala Val Ala Ala Arg Ala Ser
130 135 140

Gly Ala Ala His Ala Arg Leu Leu Ser Phe Pro Asp Leu Phe Leu Ser
145 150 155 160

Thr Arg Arg Leu Phe Leu Glu Arg Met Ala Arg Gln Glu Pro Glu His
165 170 175

His Asp Asp Thr Leu Ala Glu Trp Leu Asp Trp Thr Leu Gly Arg His
180 185 190

Gly His Ser Phe Asp Glu Glu Ile Val Thr Gly Gln Trp Ser Ile Asp
195 200 205

Gln Thr Pro Ala Pro Val Arg Leu Asp Ala Gly Gly Pro Thr Val Pro
210 215 220

Met Arg Tyr Val Pro Tyr Ser Gly Leu Val Pro Thr Val Val Pro Asp
225 230 235 240

Trp Leu Arg Arg Pro Pro Glu Arg Pro Arg Val Leu Val Thr Leu Gly
245 250 255

Ile Thr Ser Arg Arg Val Lys Ser Phe Leu Ala Val Ser Val Asp Asp
260 265 270

Leu Phe Glu Ala Val Ala Gly Leu Gly Val Glu Val Val Ala Thr Leu
275 280 285

Asp Ala Asp Gln Arg Glu Leu Leu Gly Arg Val Pro Asp His Phe Arg
290 295 300

Ile Val Glu His Val Pro Leu Asp Ala Val Leu Pro Thr Cys Ser Ala
305 310 315 320

Ile Val His His Gly Gly Ala Gly Thr Trp Ser Thr Ala Ala Val Tyr
325 330 335

Gly Val Pro Gln Val Ser Leu Gly Ser Met Trp Asp His Phe Tyr Arg
340 345 350

Ala Arg Arg Leu Glu Glu Leu Gly Ala Gly Leu Arg Leu Pro Ser Gly
355 360 365

Glu Leu Thr Ala Glu Gly Leu Arg Thr Arg Leu Glu Arg Val Leu Gly
370 375 380

Glu Pro Ser Phe Gly Thr Ala Ala Gln Ala Leu Ser Asp Thr Ile Ala
385 390 395 400

Ala Glu Pro Ser Pro Ser Glu Val Val Pro Val Leu Glu Glu Leu Thr
405 410 415

Gly Arg His Arg Pro Gly Thr Arg
420

<210> 15
<211> 139
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate of snoL, function: unknown"
<400> 15

Met Ser Thr Thr Ala Asn Lys Glu Arg Cys Leu Glu Met Val Ala Ala
1 5 10 15

Trp Asn Arg Trp Asp Val Ser Gly Val Val Ala His Trp Ala Pro Asp
20 25 30

Val Val His Tyr Asp Asp Glu Asp Lys Pro Val Ser Ala Glu Glu Val
35 40 45

Val Arg Arg Met Asn Ser Ala Val Glu Ala Phe Pro Asp Leu Arg Leu
50 55 60

Asp Val Arg Ser Ile Val Gly Glu Gly Asp Arg Val Met Leu Arg Ile
65 70 75 80

Thr Cys Ser Ala Thr His Gln Gly Val Phe Met Gly Ile Ala Pro Thr
85 90 95

Gly Arg Lys Val Arg Trp Thr Tyr Leu Glu Glu Leu Arg Phe Ser Glu
100 105 110

Ala Gly Lys Val Val Glu His Trp Asp Val Phe Asn Phe Ser Pro Leu
115 120 125

Phe Arg Asp Leu Gly Val Val Pro Asp Gly Leu
130 135

<210> 16
<211> 155
<212> PRT
<213> Streptomyces nogalater ATCC 27451

<220>
<223> "translate ... cluster"
<400> 16

Met Ser Val Arg Thr Asp Gln Thr Ala Ala Pro Glu Asp Arg Ala Ala
1 5 10 15

Ala Thr Asp Pro Gly Phe Gly His Leu Tyr Ala Gln Val Gln Gln Phe
20 25 30

Tyr Ala Arg Gln Met Gln Leu Leu Asp Ser Gly Ala Ala Glu Glu Trp
35 40 45

WO 00/24775 



PCT/FI99/00870 



24 



Ala Ala Thr Phe Thr Glu Asp Gly Thr Phe Ala Arg Pro Ser Ser Pro
50 55 60

Glu Pro Ala Arg Gly His Ala Glu Leu Ala Ala Gly Ala Arg Ala Ala
65 70 75 80

Ala Glu Arg Leu Ala Ala Glu Gly Leu Ser His Arg His Val Ile Gly
85 90 95

Met Thr Ala Val Arg Arg Glu Pro Asp Gly Ser Val Phe Val Arg Ser
100 105 110

Tyr Ala Gln Val Phe Ala Thr Arg Arg Gly Glu Ala Pro Arg Leu His
115 120 125

Leu Ile Cys Val Cys Glu Asp Val Leu Val Arg Glu Gly Pro Gly Leu
130 135 140

Lys Val Arg Glu Arg Val Val Thr His Asp Ala
145 150 155

<210> 17
<211> 281
<212> PRT

<213> Streptomyces nogalater ATCC 27451
<220>

<223> "translate of snoaF, function: C-7 ketoreductase"

<400> 17

Val Arg Ala Met Thr Asp Ser Thr Gly Pro Arg Pro Val Pro Ala Met
1 5 10 15

Ser Pro Ala Pro Ser Pro Thr Pro Ser Pro Gly Pro Ala Pro Gly Ser
20 25 30

Glu Pro Ala Pro Leu Ala Val Ile Val Thr Gly Gly Gly Ser Gly Ile
35 40 45



Gly Arg Ala Thr Ala Arg Ala Phe Ala Ala Gln Gly Ala Lys Val Leu
50 55 60

Val Val Gly Arg Thr Glu Asp Ala Leu Ala Gln Thr Ala Glu Gly Cys
65 70 75 80

Ala Asp Met Arg Val Leu Val Ala Asp Val Ala Ser Pro Asp Gly Pro
85 90 95

Gln Ala Val Val Asn Ala Ala Leu Arg Glu Phe Gly Arg Ile Asp Val
100 105 110

Leu Val Asn Asn Ala Ala Val Ala Gly Met Glu Thr Leu Gln Thr Val
115 120 125

Asp Arg Asp Ala Val Ala Arg Gln Phe Gly Thr Asn Leu Thr Ala Pro
130 135 140

Leu Phe Leu Val Gln Ser Ala Leu Gly Ala Leu Glu Lys Ser Arg Gly
145 150 155 160

Ile Val Val Asn Val Gly Thr Ala Ala Thr Leu Gly Leu Arg Ala Ala
165 170 175

Pro Thr Gly Ala Leu Tyr Gly Ala Ser Lys Val Ala Leu Asp Tyr Leu
180 185 190

Thr Arg Thr Trp Ala Val Glu Leu Ala Pro Arg Gly Ile Arg Val Val
195 200 205

Gly Val Ala Pro Gly Val Ile Asp Thr Gly Ile Gly Val Arg Met Gly
210 215 220

Met Thr Pro Glu Gly Tyr Arg Glu Phe Leu Thr Gly Met Gly Gly Arg
225 230 235 240

Val Pro Val Gly Arg Val Gly Arg Pro Glu Asp Val Ala Trp Trp Ile
245 250 255

Val Gln Leu Ala Arg Pro Glu Ala Gly Tyr Ala Thr Gly Met Val Val
260 265 270

Pro Val Asp Gly Gly Leu Ser Leu Val
275 280

<210> 18 
<211> 190 
<212> PRT 

<213> Streptomyces nogalater ATCC 27451 
<220> 

<223> "translate of snoN, function: unknown" 

<400> 18 

Val Gln Glu Thr Glu Pro Gly Val Pro Ala Asp Leu Pro Ala Glu Ser
1 5 10 15

Asp Pro Ala Ala Leu Glu Arg Leu Ala Ala Arg Tyr Arg Arg Asp Gly
20 25 30

Tyr Val His Val Pro Gly Val Leu Asp Ala Gly Glu Val Ala Glu Tyr
35 40 45

Leu Ala Glu Ala Arg Arg Leu Leu Ala His Glu Glu Ser Val Arg Trp
50 55 60

Gly Ser Gly Ala Gly Thr Val Met Asp Tyr Val Ala Asp Ala Gln Leu
65 70 75 80

Gly Ser Asp Thr Met Arg Arg Leu Ala Thr His Pro Arg Ile Ala Ala
85 90 95

Leu Ala Glu Tyr Leu Ala Gly Ser Pro Leu Arg Leu Phe Lys Leu Glu
100 105 110

Val Leu Leu Lys Glu Asn Lys Glu Lys Asp Ala Ser Val Pro Thr Ala
115 120 125

Pro His His Asp Ala Phe Ala Phe Pro Phe Ser Thr Ala Gly Thr Ala
130 135 140

Leu Thr Ala Trp Val Ala Leu Val Asp Val Pro Val Glu Arg Gly Cys
145 150 155 160

Met Thr Phe Val Pro Gly Ser His Leu Leu Pro Asp Pro Asp Thr Gly
165 170 175

Asp Glu Pro Trp Ala Gly Ala Phe Thr Arg Pro Gly Glu Ile
180 185 190



